cudf.core.groupby.groupby.DataFrameGroupBy.describe

DataFrameGroupBy.describe(include=None, exclude=None)

Generate descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset’s distribution, excluding NaN values.

Analyzes numeric DataFrames only.

Parameters:

include: ‘all’, list-like of dtypes, or None (default), optional. Data types to include in the result. Ignored for Series.
exclude: list-like of dtypes or None (default), optional. Data types to omit from the result. Ignored for Series.

Returns:

Series or DataFrame: Summary statistics of the DataFrame provided.

Examples

>>> import cudf
>>> gdf = cudf.DataFrame({
...     "Speed": [380.0, 370.0, 24.0, 26.0],
...     "Score": [50, 30, 90, 80],
... })
>>> gdf
   Speed  Score
0  380.0     50
1  370.0     30
2   24.0     90
3   26.0     80
>>> gdf.groupby('Score').describe()
     Speed
     count   mean   std    min    25%    50%    75%     max
Score
30        1  370.0  <NA>  370.0  370.0  370.0  370.0  370.0
50        1  380.0  <NA>  380.0  380.0  380.0  380.0  380.0
80        1   26.0  <NA>   26.0   26.0   26.0   26.0   26.0
90        1   24.0  <NA>   24.0   24.0   24.0   24.0   24.0
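
To make the columns in the output above concrete, here is a minimal pure-Python sketch of the per-group statistics, using only the standard library. It is not cuDF's implementation and does not run on the GPU. The helper name `describe_groups` is hypothetical, and the sketch assumes a sample standard deviation (one degree of freedom removed) and linearly interpolated quartiles, which are the pandas defaults and are assumed here to carry over. A group with a single row has no sample standard deviation, which is why `std` is shown as `<NA>` in the example; the sketch represents that case as `None`.

```python
import statistics
from collections import defaultdict


def describe_groups(rows, key, value):
    """Group `rows` (a list of dicts) by `key` and summarize `value` per group.

    Hypothetical helper for illustration only; not part of cuDF.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])

    summary = {}
    for group_key in sorted(groups):
        values = sorted(groups[group_key])
        n = len(values)
        if n > 1:
            # Sample standard deviation (assumed to match ddof=1).
            std = statistics.stdev(values)
            # "inclusive" interpolates linearly between the min and max.
            q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
        else:
            # A single observation has no sample standard deviation,
            # and every quartile equals that one value.
            std = None
            q1 = q2 = q3 = values[0]
        summary[group_key] = {
            "count": n,
            "mean": statistics.fmean(values),
            "std": std,
            "min": values[0],
            "25%": q1,
            "50%": q2,
            "75%": q3,
            "max": values[-1],
        }
    return summary


# Same data as the cuDF example above, as plain Python records.
rows = [
    {"Speed": 380.0, "Score": 50},
    {"Speed": 370.0, "Score": 30},
    {"Speed": 24.0, "Score": 90},
    {"Speed": 26.0, "Score": 80},
]
result = describe_groups(rows, key="Score", value="Speed")
for score, stats in result.items():
    print(score, stats)
```

Because every `Score` value is unique in this data, each group holds exactly one row, so all quartiles collapse to that row's `Speed` and `std` is missing. Grouping on a key with repeated values yields groups with a defined standard deviation and distinct quartiles.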