cudf.Series.value_counts
- Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Return a Series containing counts of unique values.
The resulting object will be in descending order so that the first element is the most frequently-occurring element. Excludes NA values by default.
- Parameters:
- normalize : bool, default False
If True then the object returned will contain the relative frequencies of the unique values.
- sort : bool, default True
Sort by frequencies.
- ascending : bool, default False
Sort in ascending order.
- bins : int, optional
Rather than counting individual values, group them into half-open bins. Only works with numeric data.
- dropna : bool, default True
Don’t include counts of NaN and None.
- Returns:
- result : Series
A Series containing counts of unique values.
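To make the interaction of the parameters concrete, here is a minimal pure-Python sketch of the behaviour described above. It is not cuDF's implementation (which runs on the GPU); the function name `value_counts_sketch` is hypothetical, and the order of tied counts is an assumption that may differ from what cuDF returns.

```python
from collections import Counter
import math


def value_counts_sketch(values, normalize=False, sort=True,
                        ascending=False, dropna=True):
    """Hypothetical pure-Python model of the documented parameters."""
    def is_na(v):
        # Treat None and float NaN as missing, as the dropna description does.
        return v is None or (isinstance(v, float) and math.isnan(v))

    # Collapse every missing value into a single None key.
    cleaned = [None if is_na(v) else v for v in values]
    if dropna:
        cleaned = [v for v in cleaned if v is not None]

    counts = Counter(cleaned)
    items = list(counts.items())
    if sort:
        # Sort by frequency; tie order here is arbitrary and may differ from cuDF.
        items.sort(key=lambda kv: kv[1], reverse=not ascending)
    if normalize:
        # Relative frequency: each count divided by the sum of all counts.
        total = sum(counts.values())
        items = [(k, c / total) for k, c in items]
    return items


data = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, None]
print(value_counts_sketch(data))                  # [(3.0, 3), (2.0, 2), (1.0, 1)]
print(value_counts_sketch(data, ascending=True))  # [(1.0, 1), (2.0, 2), (3.0, 3)]
print(value_counts_sketch(data, normalize=True))  # proportions 3/6, 2/6, 1/6
```

The sketch returns a list of (value, count) pairs instead of a Series, purely to stay dependency-free.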
See also
Series.count
Number of non-NA elements in a Series.
cudf.DataFrame.count
Number of non-NA elements in a DataFrame.
Examples
>>> import cudf
>>> sr = cudf.Series([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, None])
>>> sr
0     1.0
1     2.0
2     2.0
3     3.0
4     3.0
5     3.0
6    <NA>
dtype: float64
>>> sr.value_counts()
3.0    3
2.0    2
1.0    1
Name: count, dtype: int64
The order of the counts can be changed by passing ascending=True:
>>> sr.value_counts(ascending=True)
1.0    1
2.0    2
3.0    3
Name: count, dtype: int64
With normalize set to True, the result holds relative frequencies, obtained by dividing each count by the sum of all counts:
>>> sr.value_counts(normalize=True)
3.0    0.500000
2.0    0.333333
1.0    0.166667
Name: proportion, dtype: float64
To include NA value counts, pass dropna=False:
>>> sr = cudf.Series([1.0, 2.0, 2.0, 3.0, None, 3.0, 3.0, None])
>>> sr
0     1.0
1     2.0
2     2.0
3     3.0
4    <NA>
5     3.0
6     3.0
7    <NA>
dtype: float64
>>> sr.value_counts(dropna=False)
3.0     3
2.0     2
<NA>    2
1.0     1
Name: count, dtype: int64
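With bins set, numeric values are grouped into half-open bins instead of being counted individually:
>>> import numpy as np
>>> s = cudf.Series([3, 1, 2, 3, 4, np.nan])
>>> s.value_counts(bins=3)
(2.0, 3.0]      2
(0.996, 2.0]    2
(3.0, 4.0]      1
Name: count, dtype: int64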
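The half-open intervals in the bins output are closed on the right, so a value of 2 falls in (0.996, 2.0] rather than (2.0, 3.0]. The pure-Python sketch below illustrates that assignment rule. The helper `count_in_bins` is hypothetical, and the bin edges are copied from the output shown above rather than computed, since how cuDF derives the edges is not described on this page.

```python
from bisect import bisect_left
from collections import Counter
import math


def count_in_bins(values, edges):
    """Count values falling into right-closed intervals (edges[i], edges[i+1]]."""
    counts = Counter()
    for v in values:
        if v is None or math.isnan(v):
            continue  # missing values are not assigned to any bin
        # bisect_left finds the first edge >= v, so v lands in the interval
        # whose right edge is that position (right-closed, left-open).
        i = bisect_left(edges, v) - 1
        if 0 <= i < len(edges) - 1:
            counts[(edges[i], edges[i + 1])] += 1
    return counts


# Edges taken from the displayed output, not derived here.
edges = [0.996, 2.0, 3.0, 4.0]
result = count_in_bins([3, 1, 2, 3, 4, float("nan")], edges)
for (lo, hi), n in result.items():
    print(f"({lo}, {hi}]  {n}")
```

The counts per interval agree with the example: two values in (0.996, 2.0], two in (2.0, 3.0], and one in (3.0, 4.0], with the NaN excluded.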