cudf.core.column.string.StringMethods.count

StringMethods.count(pat: str, flags: int = 0) → SeriesOrIndex

Count occurrences of pattern in each string of the Series/Index.

This function counts the number of times a regex pattern matches in each string element of the Series/Index.

Parameters:
pat : str or compiled regex

Valid regular expression.

flags : int, default 0 (no flags)

Flags to pass through to the regex engine (e.g. re.MULTILINE).

Returns:
Series or Index

Examples

>>> import cudf
>>> s = cudf.Series(['A', 'B', 'Aaba', 'Baca', None, 'CABA', 'cat'])
>>> s.str.count('a')
0       0
1       0
2       2
3       2
4    <NA>
5       0
6       1
dtype: int32

Escape '$' to find the literal dollar sign.

>>> s = cudf.Series(['$', 'B', 'Aab$', '$$ca', 'C$B$', 'cat'])
>>> s.str.count(r'\$')
0    1
1    0
2    1
3    2
4    2
5    0
dtype: int32

This is also available on Index.

>>> index = cudf.Index(['A', 'A', 'Aaba', 'cat'])
>>> index.str.count('a')
Index([0, 0, 2, 1], dtype='int64')
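
The flags parameter changes how the pattern is matched. The sketch below illustrates the effect of re.MULTILINE using Python's standard re module on the host rather than cudf itself, so it only approximates what the GPU regex engine does; count_matches is a hypothetical helper written for this illustration, not part of cudf.

```python
import re

def count_matches(text, pat, flags=0):
    """Hypothetical helper: count regex matches in one string.

    None is passed through unchanged, mirroring the <NA> result
    shown for null elements in the examples above.
    """
    if text is None:
        return None
    return len(re.findall(pat, text, flags))

text = "apple\navocado\nbanana"

# Without flags, '^' matches only at the start of the whole string.
default_count = count_matches(text, r"^a")

# With re.MULTILINE, '^' also matches at the start of every line.
multiline_count = count_matches(text, r"^a", re.MULTILINE)

print(default_count, multiline_count)  # prints: 1 2
```

Assuming a cudf Series holding the same string, the corresponding call would be s.str.count('^a', flags=re.MULTILINE).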

Pandas Compatibility Note

pandas.Series.str.count()

  • The flags parameter currently supports only re.DOTALL and re.MULTILINE.

  • Some characters need to be escaped when passed in pat. For example, '$' has a special meaning in regex and must be escaped to match the literal character.
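
One way to avoid escaping by hand is to build the pattern with re.escape from the Python standard library. The sketch below runs on the host with the re module only, as an illustration of the escaping step; it does not call cudf, and the variable names are chosen for this example.

```python
import re

literal = "$"

# re.escape adds the backslash that regex metacharacters need.
pattern = re.escape(literal)

values = ["$", "B", "Aab$", "$$ca", "C$B$", "cat"]

# Count literal occurrences in each string using the escaped pattern.
counts = [len(re.findall(pattern, v)) for v in values]

print(pattern)  # prints: \$
print(counts)   # prints: [1, 0, 1, 2, 2, 0]
```

The escaped pattern is the same backslash-dollar string used in the '$' example above, so it can be passed as pat to s.str.count in the same way.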