cudf.Series.value_counts

Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

Return a Series containing counts of unique values.

The resulting object is sorted in descending order of count, so the first element is the most frequently occurring value. NA values are excluded by default.

Parameters:

normalize (bool, default False): If True, the returned object contains the relative frequencies of the unique values.
sort (bool, default True): Sort by frequencies.
ascending (bool, default False): Sort in ascending order.
bins (int, optional): Rather than counting values, group them into half-open bins. Only works with numeric data.
dropna (bool, default True): Don’t include counts of NaN and None.

Returns:

result: Series containing counts of unique values.

See also

Series.count: Number of non-NA elements in a Series.
cudf.DataFrame.count: Number of non-NA elements in a DataFrame.

Examples

>>> import cudf
>>> sr = cudf.Series([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, None])
>>> sr
0     1.0
1     2.0
2     2.0
3     3.0
4     3.0
5     3.0
6    <NA>
dtype: float64
>>> sr.value_counts()
3.0    3
2.0    2
1.0    1
Name: count, dtype: int64

The order of the counts can be changed by passing ascending=True:

>>> sr.value_counts(ascending=True)
1.0    1
2.0    2
3.0    3
Name: count, dtype: int64

With normalize=True, the result contains relative frequencies, obtained by dividing each count by the sum of all counts:

>>> sr.value_counts(normalize=True)
3.0    0.500000
2.0    0.333333
1.0    0.166667
Name: proportion, dtype: float64
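
The arithmetic behind the relative frequencies can be sketched in plain Python. This is only an illustration using the standard library, not how cuDF computes the result on the GPU; the helper name relative_frequencies is hypothetical and is not part of the cuDF API.

```python
from collections import Counter


def relative_frequencies(values):
    """Hypothetical helper: each count divided by the total count."""
    # Drop missing values, mirroring the default dropna=True behaviour.
    present = [v for v in values if v is not None]
    counts = Counter(present)
    total = sum(counts.values())
    # most_common() yields (value, count) pairs in descending order of count.
    return {value: count / total for value, count in counts.most_common()}


freqs = relative_frequencies([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, None])
print(freqs)
```

Here the counts 3, 2 and 1 are each divided by their sum, 6, which gives the proportions shown in the example above.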

To include NA value counts, pass dropna=False:

>>> sr = cudf.Series([1.0, 2.0, 2.0, 3.0, None, 3.0, 3.0, None])
>>> sr
0     1.0
1     2.0
2     2.0
3     3.0
4    <NA>
5     3.0
6     3.0
7    <NA>
dtype: float64
>>> sr.value_counts(dropna=False)
3.0     3
2.0     2
<NA>    2
1.0     1
Name: count, dtype: int64

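To group numeric values into half-open bins instead of counting each unique value, pass bins:

>>> import numpy as np
>>> s = cudf.Series([3, 1, 2, 3, 4, np.nan])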
>>> s.value_counts(bins=3)
(2.0, 3.0]      2
(0.996, 2.0]    2
(3.0, 4.0]      1
Name: count, dtype: int64
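
The bin counts in the output above can be reproduced with a rough plain-Python sketch. It assumes equal-width intervals between the minimum and maximum, closed on the right, with the minimum placed in the first bin; that assumption matches the output shown but is not taken from the cuDF implementation. The helper name bin_counts is hypothetical.

```python
import math


def bin_counts(values, bins):
    """Hypothetical helper: count values in equal-width (left, right] bins."""
    # Skip missing values (None or NaN) before binning.
    data = [v for v in values if v is not None and not math.isnan(v)]
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins  # assumes hi > lo
    counts = [0] * bins
    for v in data:
        # Right-closed intervals; clamp so the minimum lands in the first bin.
        idx = max(math.ceil((v - lo) / width) - 1, 0)
        counts[min(idx, bins - 1)] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = bin_counts([3, 1, 2, 3, 4, float("nan")], bins=3)
print(edges, counts)
```

The sketch returns counts in bin order rather than sorted by frequency, and it does not reproduce the slightly lowered left edge (0.996) that appears in the output above.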