GroupBy

`DataFrameGroupBy` and `SeriesGroupBy` instances are returned by the groupby calls `cudf.DataFrame.groupby()` and `cudf.Series.groupby()`, respectively.

Indexing, iteration

`DataFrameGroupBy.__iter__`()
`SeriesGroupBy.__iter__`()
`DataFrameGroupBy.groups`	Returns a dictionary mapping group keys to row labels.
`SeriesGroupBy.groups`	Returns a dictionary mapping group keys to row labels.
`DataFrameGroupBy.indices`	Dict {group name -> group indices}.
`SeriesGroupBy.indices`	Dict {group name -> group indices}.
`DataFrameGroupBy.get_group`(name[, obj])	Construct DataFrame from group with provided name.
`SeriesGroupBy.get_group`(name[, obj])	Construct a Series from the group with the provided name.
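
To make the accessors above concrete, here is a minimal plain-Python model of what they expose, based only on the one-line descriptions in this table. It is a conceptual sketch and not cuDF code: the `keys` and `values` lists and the helpers `build_indices` and `get_group` are hypothetical, row positions stand in for row labels (as with a default integer index), and the sorted iteration order is a choice made for this sketch.

```python
from collections import defaultdict

# Hypothetical data: one grouping key and one value per row.
keys = ["a", "b", "a", "c", "b"]
values = [10, 20, 30, 40, 50]


def build_indices(keys):
    """Map each group key to the positions of its rows (cf. `indices`)."""
    indices = defaultdict(list)
    for position, key in enumerate(keys):
        indices[key].append(position)
    return dict(indices)


def get_group(name, keys, values):
    """Return the values of the rows in group `name` (cf. `get_group`)."""
    return [values[i] for i in build_indices(keys)[name]]


indices = build_indices(keys)

# Iteration is modeled as (group key, group data) pairs,
# visited here in sorted key order (an assumption of this sketch).
pairs = [(name, get_group(name, keys, values)) for name in sorted(indices)]
```

In this model, `indices` plays the role of both `groups` and `indices`, because positions and labels coincide when the index is the default integer range.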

Function application

`SeriesGroupBy.apply`(func, *args, **kwargs)	Apply a Python transformation function over the grouped chunk.
`DataFrameGroupBy.apply`(func, *args[, ...])	Apply a Python transformation function over the grouped chunk.
`SeriesGroupBy.agg`(func, *args[, engine, ...])	Apply aggregation(s) to the groups.
`DataFrameGroupBy.agg`([func, engine, ...])	Apply aggregation(s) to the groups.
`SeriesGroupBy.aggregate`(func, *args[, ...])	Apply aggregation(s) to the groups.
`DataFrameGroupBy.aggregate`([func, engine, ...])	Apply aggregation(s) to the groups.
`SeriesGroupBy.transform`(func, *args[, ...])	Apply an aggregation, then broadcast the result to the group size.
`DataFrameGroupBy.transform`(func, *args[, ...])	Apply an aggregation, then broadcast the result to the group size.
`SeriesGroupBy.pipe`(func, *args, **kwargs)	Apply a function func with arguments to this GroupBy object and return the function's result.
`DataFrameGroupBy.pipe`(func, *args, **kwargs)	Apply a function func with arguments to this GroupBy object and return the function's result.
`DataFrameGroupBy.filter`(func[, dropna])	Filter elements from groups that don't satisfy a criterion.
`SeriesGroupBy.filter`(func[, dropna])	Filter elements from groups that don't satisfy a criterion.
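
The difference between `agg`, `transform`, and `filter` is mostly a difference in output shape. The following plain-Python sketch models the three behaviors as described in the table above; it is not cuDF code, and the data, the sum aggregation, and the "group sum greater than 4" criterion are all hypothetical choices for illustration.

```python
from collections import defaultdict

# Hypothetical data: one grouping key and one value per row.
keys = ["a", "b", "a", "b", "c"]
values = [1, 2, 3, 4, 5]

# Split: collect each group's values.
groups = defaultdict(list)
for key, value in zip(keys, values):
    groups[key].append(value)

# agg: reduce every group to a single value (one result per group).
agg_sum = {key: sum(vals) for key, vals in groups.items()}

# transform: compute the aggregate, then broadcast it back to every row,
# so the result has the same length as the input.
transform_sum = [agg_sum[key] for key in keys]

# filter: keep only the rows whose whole group satisfies a criterion,
# here "the group sum is greater than 4".
filtered = [value for key, value in zip(keys, values) if agg_sum[key] > 4]
```

Here `agg_sum` has one entry per group, `transform_sum` has one entry per input row, and `filtered` drops every row of group `a` because that group's sum does not pass the criterion.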

`DataFrameGroupBy` computations / descriptive stats

`DataFrameGroupBy.all`([skipna])	Return True if all values in the group are truthful, else False.
`DataFrameGroupBy.any`([skipna])	Return True if any value in the group is truthful, else False.
`DataFrameGroupBy.bfill`([limit])	Backward fill NA values.
`DataFrameGroupBy.corr`([method, min_periods, ...])	Compute pairwise correlation of columns, excluding NA/null values.
`DataFrameGroupBy.count`([dropna])	Compute the number of values in each column.
`DataFrameGroupBy.cov`([min_periods, ddof, ...])	Compute the pairwise covariance among the columns of a DataFrame, excluding NA/null values.
`DataFrameGroupBy.cumcount`([ascending])	Return the cumulative count of keys in each group.
`DataFrameGroupBy.cummax`(*args, **kwargs)	Cumulative max for each group.
`DataFrameGroupBy.cummin`(*args, **kwargs)	Cumulative min for each group.
`DataFrameGroupBy.cumprod`(*args, **kwargs)	Cumulative product for each group.
`DataFrameGroupBy.cumsum`(*args, **kwargs)	Cumulative sum for each group.
`DataFrameGroupBy.describe`([percentiles, ...])	Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values.
`DataFrameGroupBy.diff`([periods, axis])	Get the difference between the values in each group.
`DataFrameGroupBy.ewm`(*args, **kwargs)	Return an ewm grouper, providing ewm functionality per group.
`DataFrameGroupBy.expanding`(*args, **kwargs)	Return an expanding grouper, providing expanding functionality per group.
`DataFrameGroupBy.ffill`([limit])	Forward fill NA values.
`DataFrameGroupBy.first`([numeric_only, min_count])	Compute first of group values.
`DataFrameGroupBy.head`([n, preserve_order])	Return the first n rows of each group.
`DataFrameGroupBy.idxmax`([numeric_only, ...])	Compute idxmax of group values.
`DataFrameGroupBy.idxmin`([numeric_only, ...])	Compute idxmin of group values.
`DataFrameGroupBy.last`([numeric_only, min_count])	Compute last of group values.
`DataFrameGroupBy.max`([numeric_only, min_count])	Compute max of group values.
`DataFrameGroupBy.mean`([numeric_only, min_count])	Compute mean of group values.
`DataFrameGroupBy.median`([numeric_only, ...])	Compute median of group values.
`DataFrameGroupBy.min`([numeric_only, min_count])	Compute min of group values.
`DataFrameGroupBy.ngroup`([ascending])	Number each group from 0 to the number of groups - 1.
`DataFrameGroupBy.nth`(n[, dropna])	Return the nth row from each group.
`DataFrameGroupBy.nunique`([dropna])	Return number of unique elements in the group.
`DataFrameGroupBy.ohlc`()	Compute open, high, low and close values of a group, excluding missing values.
`DataFrameGroupBy.pct_change`([periods, ...])	Calculate the percent change between sequential elements in the group.
`DataFrameGroupBy.prod`([numeric_only, min_count])	Compute prod of group values.
`DataFrameGroupBy.quantile`([q, ...])	Compute the column-wise quantiles of the values in each group.
`DataFrameGroupBy.rank`([method, ascending, ...])	Return the rank of values within each group.
`DataFrameGroupBy.resample`(rule, *args[, ...])	Provide resampling when using a TimeGrouper.
`DataFrameGroupBy.rolling`(*args, **kwargs)	Return a RollingGroupby object that enables rolling window calculations on the groups.
`DataFrameGroupBy.sample`([n, frac, replace, ...])	Return a random sample of items in each group.
`DataFrameGroupBy.shift`([periods, freq, ...])	Shift each group by `periods` positions.
`DataFrameGroupBy.size`()	Return the size of each group.
`DataFrameGroupBy.std`([ddof, engine, ...])	Compute the column-wise std of the values in each group.
`DataFrameGroupBy.sum`([numeric_only, min_count])	Compute sum of group values.
`DataFrameGroupBy.var`([ddof, engine, ...])	Compute the column-wise variance of the values in each group.
`DataFrameGroupBy.tail`([n, preserve_order])	Return the last n rows of each group.
`DataFrameGroupBy.take`(indices)	Return the elements in the given positional indices in each group.
`DataFrameGroupBy.value_counts`([subset, ...])	Return a Series or DataFrame containing counts of unique rows.
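
The methods in this table fall into two broad shapes: reductions such as `sum` or `size` return one value per group, while methods such as `cumcount`, `cumsum`, and `ngroup` return one value per input row. The plain-Python sketch below models that distinction from the descriptions above. It is not cuDF code; the data is hypothetical, and numbering groups in sorted key order is an assumption of the sketch.

```python
# Hypothetical data: one grouping key and one value per row.
keys = ["x", "y", "x", "x", "y"]
values = [1, 10, 2, 3, 20]

# Group numbers in sorted key order (an assumption of this sketch).
group_number = {key: n for n, key in enumerate(sorted(set(keys)))}

cumcount, cumsum, ngroup = [], [], []
seen = {}     # number of rows seen so far, per group
running = {}  # running total, per group
for key, value in zip(keys, values):
    cumcount.append(seen.get(key, 0))   # 0-based position within the group
    seen[key] = seen.get(key, 0) + 1
    running[key] = running.get(key, 0) + value
    cumsum.append(running[key])         # cumulative sum within the group
    ngroup.append(group_number[key])    # number of the group the row is in

# Reductions return one value per group instead of one per row.
size = dict(seen)
total = dict(running)
```

The row-wise lists line up with the input rows, so they can be attached back to the original data, whereas `size` and `total` are keyed by group.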

`SeriesGroupBy` computations / descriptive stats

`SeriesGroupBy.all`([skipna])	Return True if all values in the group are truthful, else False.
`SeriesGroupBy.any`([skipna])	Return True if any value in the group is truthful, else False.
`SeriesGroupBy.bfill`([limit])	Backward fill NA values.
`SeriesGroupBy.corr`(other[, method, min_periods])
`SeriesGroupBy.count`([dropna])	Compute the number of values in each column.
`SeriesGroupBy.cov`([min_periods, ddof, ...])	Compute the pairwise covariance among the columns of a DataFrame, excluding NA/null values.
`SeriesGroupBy.cumcount`([ascending])	Return the cumulative count of keys in each group.
`SeriesGroupBy.cummax`(*args, **kwargs)	Cumulative max for each group.
`SeriesGroupBy.cummin`(*args, **kwargs)	Cumulative min for each group.
`SeriesGroupBy.cumprod`(*args, **kwargs)	Cumulative product for each group.
`SeriesGroupBy.cumsum`(*args, **kwargs)	Cumulative sum for each group.
`SeriesGroupBy.describe`([percentiles, ...])	Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset's distribution, excluding NaN values.
`SeriesGroupBy.diff`([periods, axis])	Get the difference between the values in each group.
`SeriesGroupBy.ewm`(*args, **kwargs)	Return an ewm grouper, providing ewm functionality per group.
`SeriesGroupBy.expanding`(*args, **kwargs)	Return an expanding grouper, providing expanding functionality per group.
`SeriesGroupBy.ffill`([limit])	Forward fill NA values.
`SeriesGroupBy.first`([numeric_only, min_count])	Compute first of group values.
`SeriesGroupBy.head`([n, preserve_order])	Return the first n rows of each group.
`SeriesGroupBy.last`([numeric_only, min_count])	Compute last of group values.
`SeriesGroupBy.idxmax`([numeric_only, min_count])	Compute idxmax of group values.
`SeriesGroupBy.idxmin`([numeric_only, min_count])	Compute idxmin of group values.
`SeriesGroupBy.is_monotonic_increasing`	Return whether each group's values are monotonically increasing.
`SeriesGroupBy.is_monotonic_decreasing`	Return whether each group's values are monotonically decreasing.
`SeriesGroupBy.max`([numeric_only, min_count])	Compute max of group values.
`SeriesGroupBy.mean`([numeric_only, min_count])	Compute mean of group values.
`SeriesGroupBy.median`([numeric_only, min_count])	Compute median of group values.
`SeriesGroupBy.min`([numeric_only, min_count])	Compute min of group values.
`SeriesGroupBy.ngroup`([ascending])	Number each group from 0 to the number of groups - 1.
`SeriesGroupBy.nlargest`([n, keep])	Return the largest n elements.
`SeriesGroupBy.nsmallest`([n, keep])	Return the smallest n elements.
`SeriesGroupBy.nth`(n[, dropna])	Return the nth row from each group.
`SeriesGroupBy.nunique`([dropna])	Return number of unique elements in the group.
`SeriesGroupBy.unique`()	Get a list of the unique values for each column in each group.
`SeriesGroupBy.ohlc`()	Compute open, high, low and close values of a group, excluding missing values.
`SeriesGroupBy.pct_change`([periods, ...])	Calculate the percent change between sequential elements in the group.
`SeriesGroupBy.prod`([numeric_only, min_count])	Compute prod of group values.
`SeriesGroupBy.quantile`([q, interpolation, ...])	Compute the column-wise quantiles of the values in each group.
`SeriesGroupBy.rank`([method, ascending, ...])	Return the rank of values within each group.
`SeriesGroupBy.resample`(rule, *args[, ...])	Provide resampling when using a TimeGrouper.
`SeriesGroupBy.rolling`(*args, **kwargs)	Return a RollingGroupby object that enables rolling window calculations on the groups.
`SeriesGroupBy.sample`([n, frac, replace, ...])	Return a random sample of items in each group.
`SeriesGroupBy.shift`([periods, freq, axis, ...])	Shift each group by `periods` positions.
`SeriesGroupBy.size`()	Return the size of each group.
`SeriesGroupBy.std`([ddof, engine, ...])	Compute the column-wise std of the values in each group.
`SeriesGroupBy.sum`([numeric_only, min_count])	Compute sum of group values.
`SeriesGroupBy.var`([ddof, engine, ...])	Compute the column-wise variance of the values in each group.
`SeriesGroupBy.tail`([n, preserve_order])	Return the last n rows of each group.
`SeriesGroupBy.take`(indices)	Return the elements in the given positional indices in each group.
`SeriesGroupBy.value_counts`([normalize, ...])
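
A few entries in this table exist only on `SeriesGroupBy`: `is_monotonic_increasing`, `nlargest`, `nsmallest`, and `unique`. The plain-Python sketch below models what three of them compute per group, following the descriptions above. It is not cuDF code; the data is hypothetical, and treating "monotonically increasing" as non-strict (equal neighbors allowed) is an assumption of the sketch.

```python
import heapq
from collections import defaultdict

# Hypothetical data: one grouping key and one value per row.
keys = ["a", "a", "a", "b", "b"]
values = [1, 3, 3, 5, 2]

# Split: collect each group's values in row order.
groups = defaultdict(list)
for key, value in zip(keys, values):
    groups[key].append(value)

# is_monotonic_increasing: one boolean per group; each value must be
# greater than or equal to its predecessor (non-strict, by assumption).
is_monotonic_increasing = {
    key: all(x <= y for x, y in zip(vals, vals[1:]))
    for key, vals in groups.items()
}

# nlargest: the n largest values of each group, here with n=2.
nlargest_2 = {key: heapq.nlargest(2, vals) for key, vals in groups.items()}

# unique: the distinct values of each group, in first-seen order.
unique = {key: list(dict.fromkeys(vals)) for key, vals in groups.items()}
```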

Plotting and visualization

`DataFrameGroupBy.boxplot`([subplots, column, ...])
`DataFrameGroupBy.hist`([column, by, grid, ...])
`SeriesGroupBy.hist`([by, ax, grid, ...])
`DataFrameGroupBy.plot`	Make plots of a grouped Series or DataFrame.
`SeriesGroupBy.plot`	Make plots of a grouped Series or DataFrame.