GroupBy

GroupBy objects are returned by groupby calls: cudf.DataFrame.groupby(), cudf.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__()

GroupBy.groups

Returns a dictionary mapping group keys to row labels.
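A minimal sketch of iterating over a GroupBy object and reading `groups`. The column names and data are illustrative, and the pandas fallback import is an assumption made only so the snippet can run on a machine without a GPU.

```python
try:
    import cudf as xdf  # the GPU DataFrame library documented here
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
grouped = df.groupby("key")

# Iterating yields (group key, sub-frame) pairs.
sizes = {}
for name, group in grouped:
    sizes[name] = len(group)

# .groups maps each group key to the row labels that belong to it.
group_keys = sorted(grouped.groups.keys())
```

Here `sizes` ends up with one entry per key, and `group_keys` lists the distinct keys.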

Grouper([key, level, freq, closed, label])

Function application

GroupBy.apply(function, *args[, engine])

Apply a Python transformation function over each grouped chunk.
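As a rough illustration of `apply`, the sketch below passes a user-defined function that receives one group's rows at a time. The function name `value_range` and the data are hypothetical, and the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 5, 2, 4]})

def value_range(chunk):
    # chunk is the sub-frame holding the rows of a single group
    return chunk["val"].max() - chunk["val"].min()

# Select the value column first so the function sees only that column.
ranges = df.groupby("key")[["val"]].apply(value_range)  # a -> 4, b -> 2
```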

GroupBy.agg(func)

Apply aggregation(s) to the groups.

SeriesGroupBy.aggregate(func)

Apply aggregation(s) to the groups.

DataFrameGroupBy.aggregate(func)

Apply aggregation(s) to the groups.
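The aggregation entry points above can be sketched as follows. This is an illustrative example with made-up column names, not an exhaustive list of accepted arguments; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b"], "x": [1, 2, 3], "y": [10.0, 20.0, 30.0]}
)
grouped = df.groupby("key")

# One aggregation applied to every non-key column.
totals = grouped.agg("sum")

# A different aggregation per column, given as a dict.
mixed = grouped.agg({"x": "max", "y": "mean"})

# SeriesGroupBy: select a column first, then aggregate it.
x_min = grouped["x"].aggregate("min")
```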

GroupBy.pipe(func, *args, **kwargs)

Apply a function func with arguments to this GroupBy object and return the function's result.
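A small sketch of `pipe`: unlike `apply`, the function receives the GroupBy object itself rather than one group at a time. The helper `spread` and its `scale` argument are hypothetical, and the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 5, 2, 4]})

def spread(gb, scale):
    # gb is the whole GroupBy object, so group-wise reductions can be chained
    return (gb.max() - gb.min()) * scale

result = df.groupby("key").pipe(spread, 10)  # val: a -> 40, b -> 20
```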

GroupBy.transform(function)

Apply an aggregation, then broadcast the result to the group size.
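To make the "aggregate, then broadcast" behavior concrete, the sketch below attaches each group's maximum to every row of that group. Data and column names are illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]}
)

# The result has one value per input row, not one per group.
df["group_max"] = df.groupby("key")["val"].transform("max")
```

Every row with key `a` receives 5 and every row with key `b` receives 4.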

Computations / descriptive stats

GroupBy.bfill([limit])

Backward fill NA values.

GroupBy.backfill([limit])

Backward fill NA values.

GroupBy.count([dropna])

Compute the number of values in each column.

GroupBy.cumcount()

Return the cumulative count of keys in each group.

GroupBy.cummax(*args, **kwargs)

Cumulative max for each group.

GroupBy.cummin(*args, **kwargs)

Cumulative min for each group.

GroupBy.cumsum(*args, **kwargs)

Cumulative sum for each group.
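The cumulative methods return one value per input row, with the running state restarting in each group. A minimal sketch with illustrative data (the pandas fallback is an assumption for running without a GPU):

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]}
)
grouped = df.groupby("key")

running_total = grouped["val"].cumsum()  # group a: 1, 3, 8; group b: 3, 7
running_max = grouped["val"].cummax()
position = grouped.cumcount()            # 0-based position inside each group
```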

GroupBy.diff([periods, axis])

Get the difference between the values in each group.
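A short sketch of `diff` with its default period of one row. The first row of each group has no predecessor inside the group, so its difference is missing. Data is illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]}
)

# Difference between each value and the previous value in the same group.
step = df.groupby("key")["val"].diff()
```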

GroupBy.ffill([limit])

Forward fill NA values.
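The fill methods never carry a value across a group boundary, which the following sketch tries to show. Data is illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1.0, None, None, 4.0, None]}
)
grouped = df.groupby("key")["val"]

# Forward fill: each gap takes the last earlier value from the same group.
filled_fwd = grouped.ffill()

# Backward fill: each gap takes the next later value from the same group.
filled_bwd = grouped.bfill()
```

The first row of group `b` stays missing after `ffill` because no earlier value exists in that group.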

GroupBy.first([numeric_only, min_count])

Compute first of group values.

GroupBy.get_group(name[, obj])

Construct DataFrame from group with provided name.
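A minimal sketch of pulling out a single group by its key. The data is illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "b", "a"], "val": [1, 2, 3]})

# Only the rows whose key equals "a" are returned.
group_a = df.groupby("key").get_group("a")
```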

GroupBy.idxmax([numeric_only, min_count])

Compute idxmax of group values.

GroupBy.idxmin([numeric_only, min_count])

Compute idxmin of group values.
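`idxmax` and `idxmin` return row labels rather than values. A small sketch using the default integer index (data is illustrative; the pandas fallback is an assumption for running without a GPU):

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]}
)
grouped = df.groupby("key")["val"]

label_of_max = grouped.idxmax()  # a -> 4, b -> 3
label_of_min = grouped.idxmin()  # a -> 0, b -> 2
```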

GroupBy.last([numeric_only, min_count])

Compute last of group values.

GroupBy.max([numeric_only, min_count])

Compute max of group values.

GroupBy.mean([numeric_only, min_count])

Compute mean of group values.

GroupBy.median([numeric_only, min_count])

Compute median of group values.

GroupBy.min([numeric_only, min_count])

Compute min of group values.

GroupBy.ngroup([ascending])

Number each group from 0 to the number of groups - 1.
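A sketch of `ngroup`, which labels every row with the number of the group it belongs to. Passing `sort=True` to `groupby` is a choice made here so that the numbering follows sorted key order; the data is illustrative and the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["b", "a", "b", "c"], "val": [1, 2, 3, 4]})

# With sorted keys: a -> 0, b -> 1, c -> 2.
group_number = df.groupby("key", sort=True).ngroup()

# ascending=False numbers the groups from (number of groups - 1) down to 0.
reversed_number = df.groupby("key", sort=True).ngroup(ascending=False)
```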

GroupBy.nth(n)

Return the nth row from each group.

GroupBy.nunique([numeric_only, min_count])

Compute nunique of group values.

GroupBy.pad([limit])

Forward fill NA values.

GroupBy.prod([numeric_only, min_count])

Compute prod of group values.

GroupBy.shift([periods, freq, axis, fill_value])

Shift each group by periods positions.
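A minimal sketch of `shift`: values move down by one position inside each group, and the vacated first slot of each group becomes missing. Data is illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]}
)

# Each row receives the previous value from the same group.
previous = df.groupby("key")["val"].shift(1)
```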

GroupBy.size()

Return the size of each group.
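`size` and `count` are easy to confuse. The sketch below contrasts them on a column that contains a null; the data is illustrative and the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "a", "b"], "val": [1.0, None, 3.0]})
grouped = df.groupby("key")

rows_per_group = grouped.size()            # counts rows, nulls included
values_per_group = grouped["val"].count()  # counts non-null values only
```

Group `a` has two rows but only one non-null value, so the two results differ for that key.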

GroupBy.std([ddof])

Compute the column-wise std of the values in each group.

GroupBy.sum([numeric_only, min_count])

Compute sum of group values.

GroupBy.var([ddof])

Compute the column-wise variance of the values in each group.

GroupBy.corr([method, min_periods])

Compute pairwise correlation of columns, excluding NA/null values.

GroupBy.cov([min_periods, ddof])

Compute the pairwise covariance among the columns of a DataFrame, excluding NA/null values.

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits an axis argument, and often an argument that restricts application to columns of a specific data type.

DataFrameGroupBy.backfill([limit])

Backward fill NA values.

DataFrameGroupBy.bfill([limit])

Backward fill NA values.

DataFrameGroupBy.count([dropna])

Compute the number of values in each column.

DataFrameGroupBy.cumcount()

Return the cumulative count of keys in each group.

DataFrameGroupBy.cummax(*args, **kwargs)

Cumulative max for each group.

DataFrameGroupBy.cummin(*args, **kwargs)

Cumulative min for each group.

DataFrameGroupBy.cumsum(*args, **kwargs)

Cumulative sum for each group.

DataFrameGroupBy.describe([include, exclude])

Generate descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset's distribution, excluding NaN values.

DataFrameGroupBy.diff([periods, axis])

Get the difference between the values in each group.

DataFrameGroupBy.ffill([limit])

Forward fill NA values.

DataFrameGroupBy.fillna([value, method, ...])

Fill NA values using the specified method.

DataFrameGroupBy.idxmax([numeric_only, ...])

Compute idxmax of group values.

DataFrameGroupBy.idxmin([numeric_only, ...])

Compute idxmin of group values.

DataFrameGroupBy.nunique([numeric_only, ...])

Compute nunique of group values.

DataFrameGroupBy.pad([limit])

Forward fill NA values.

DataFrameGroupBy.quantile([q, interpolation])

Compute the column-wise quantiles of the values in each group.
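A short sketch of a group-wise quantile, here the median with the default linear interpolation. Data is illustrative; the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame(
    {"key": ["a", "a", "a", "b", "b"], "val": [1.0, 2.0, 5.0, 3.0, 4.0]}
)

# q=0.5 requests the median of every value column in each group.
medians = df.groupby("key").quantile(0.5)  # val: a -> 2.0, b -> 3.5
```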

DataFrameGroupBy.shift([periods, freq, ...])

Shift each group by periods positions.

DataFrameGroupBy.size()

Return the size of each group.

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nunique([numeric_only, min_count])

Compute nunique of group values.

SeriesGroupBy.unique()

Get a list of the unique values for each column in each group.
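As a rough illustration of `unique` on a SeriesGroupBy, the sketch below collects the distinct values of one column per group. The order of values inside each returned list is not assumed here; data is illustrative and the pandas fallback is an assumption for running without a GPU.

```python
try:
    import cudf as xdf
except ImportError:
    import pandas as xdf  # assumption: CPU fallback exposing the same calls

df = xdf.DataFrame({"key": ["a", "a", "b", "a"], "val": [1, 1, 2, 3]})

# One entry per group, each holding that group's distinct values.
uniques = df.groupby("key")["val"].unique()
```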