GroupBy#

DataFrameGroupBy and SeriesGroupBy instances are returned by calls to cudf.DataFrame.groupby() and cudf.Series.groupby(), respectively.

Indexing, iteration#

DataFrameGroupBy.__iter__()

SeriesGroupBy.__iter__()

DataFrameGroupBy.groups

Return a dictionary mapping group keys to row labels.

SeriesGroupBy.groups

Return a dictionary mapping group keys to row labels.

DataFrameGroupBy.indices

Dict {group name -> group indices}.

SeriesGroupBy.indices

Dict {group name -> group indices}.
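The difference between groups (row labels) and indices (group indices) can be easy to miss. The following is a plain-Python model of the two mappings, not cuDF code: the helper name and sample data are hypothetical, and it assumes that indices holds integer row positions, as in pandas.

```python
from collections import defaultdict

def model_groups_and_indices(keys, labels):
    """Map each group key to its row labels and to its row positions."""
    groups = defaultdict(list)   # key -> row labels (models .groups)
    indices = defaultdict(list)  # key -> integer positions (models .indices, assumed positional)
    for pos, (key, label) in enumerate(zip(keys, labels)):
        groups[key].append(label)
        indices[key].append(pos)
    return dict(groups), dict(indices)

keys = ["a", "b", "a", "b", "a"]
labels = ["r0", "r1", "r2", "r3", "r4"]
groups, indices = model_groups_and_indices(keys, labels)
print(groups)   # labels per key
print(indices)  # positions per key
```

With a default integer index the two mappings would hold the same numbers; they differ only when the index carries other labels.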

DataFrameGroupBy.get_group(name[, obj])

Construct DataFrame from group with provided name.

SeriesGroupBy.get_group(name[, obj])

Construct a Series or DataFrame from the group with the provided name.

Function application#

SeriesGroupBy.apply(func, *args, **kwargs)

Apply a python transformation function over the grouped chunk.

DataFrameGroupBy.apply(func, *args[, ...])

Apply a python transformation function over the grouped chunk.

SeriesGroupBy.agg(func, *args[, engine, ...])

Apply aggregation(s) to the groups.

DataFrameGroupBy.agg([func, engine, ...])

Apply aggregation(s) to the groups.

SeriesGroupBy.aggregate(func, *args[, ...])

Apply aggregation(s) to the groups.

DataFrameGroupBy.aggregate([func, engine, ...])

Apply aggregation(s) to the groups.

SeriesGroupBy.transform(func, *args[, ...])

Apply an aggregation, then broadcast the result to the group size.

DataFrameGroupBy.transform(func, *args[, ...])

Apply an aggregation, then broadcast the result to the group size.
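As a rough illustration of "aggregate, then broadcast", the sketch below models transform with a sum aggregation in plain Python. It is not cuDF code; the helper name and data are hypothetical, and the aggregation is fixed to a sum for brevity.

```python
from collections import defaultdict

def model_transform_sum(keys, values):
    """Aggregate per group (sum), then give every row its group's result."""
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value              # step 1: one aggregate per group
    return [totals[key] for key in keys]  # step 2: broadcast back to row count

keys = ["a", "b", "a", "b", "a"]
values = [1, 2, 3, 4, 5]
transformed = model_transform_sum(keys, values)
print(transformed)
```

The output has one entry per input row rather than one per group, which is what distinguishes transform from agg.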

SeriesGroupBy.pipe(func, *args, **kwargs)

Apply a function func with arguments to this GroupBy object and return the function's result.

DataFrameGroupBy.pipe(func, *args, **kwargs)

Apply a function func with arguments to this GroupBy object and return the function's result.
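The pipe pattern described above amounts to calling func with the grouped object as its first argument. A minimal sketch follows, using a hypothetical ToyGroupBy stand-in rather than a real cuDF object; value_range and its scale parameter are likewise invented for illustration.

```python
class ToyGroupBy:
    """Hypothetical stand-in for a GroupBy object; not a cuDF class."""

    def __init__(self, keys, values):
        self.keys = keys
        self.values = values

    def pipe(self, func, *args, **kwargs):
        # Pass this object as the first argument and return func's result unchanged.
        return func(self, *args, **kwargs)

def value_range(grouped, scale=1):
    """Per-group (max - min), multiplied by a scale factor."""
    buckets = {}
    for key, value in zip(grouped.keys, grouped.values):
        buckets.setdefault(key, []).append(value)
    return {key: (max(vals) - min(vals)) * scale for key, vals in buckets.items()}

toy = ToyGroupBy(["a", "b", "a", "b", "a"], [1, 2, 3, 4, 5])
piped = toy.pipe(value_range, scale=10)
print(piped)
```

The main benefit is readability when chaining: `obj.pipe(f).pipe(g)` reads left to right, unlike `g(f(obj))`.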

DataFrameGroupBy.filter(func[, dropna])

Filter elements from groups that don't satisfy a criterion.

SeriesGroupBy.filter(func[, dropna])

Filter elements from groups that don't satisfy a criterion.
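Note that the criterion is evaluated per group, while the result contains rows. The plain-Python model below sketches that behavior under the assumption that surviving rows keep their original order; the helper name and predicate are hypothetical, and this is not cuDF code.

```python
from collections import defaultdict

def model_filter(keys, values, predicate):
    """Keep only the rows whose whole group satisfies the predicate."""
    grouped = defaultdict(list)
    for key, value in zip(keys, values):
        grouped[key].append(value)
    # Decide once per group, based on all of that group's values.
    keep = {key for key, vals in grouped.items() if predicate(vals)}
    # Emit the surviving rows in their original order.
    return [(key, value) for key, value in zip(keys, values) if key in keep]

keys = ["a", "b", "a", "b", "a"]
values = [1, 2, 3, 4, 5]
filtered = model_filter(keys, values, lambda vals: sum(vals) > 6)
print(filtered)
```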

DataFrameGroupBy computations / descriptive stats#

DataFrameGroupBy.all([skipna])

Return True if all values in the group are truthy, else False.

DataFrameGroupBy.any([skipna])

Return True if any value in the group is truthy, else False.

DataFrameGroupBy.bfill([limit])

Backward fill NA values.

DataFrameGroupBy.corr([method, min_periods, ...])

Compute pairwise correlation of columns, excluding NA/null values.

DataFrameGroupBy.count([dropna])

Compute the number of values in each column.

DataFrameGroupBy.cov([min_periods, ddof, ...])

Compute the pairwise covariance among the columns of a DataFrame, excluding NA/null values.

DataFrameGroupBy.cumcount([ascending])

Return the cumulative count of keys in each group.

DataFrameGroupBy.cummax(*args, **kwargs)

Cumulative max for each group.

DataFrameGroupBy.cummin(*args, **kwargs)

Cumulative min for each group.

DataFrameGroupBy.cumprod(*args, **kwargs)

Cumulative product for each group.

DataFrameGroupBy.cumsum(*args, **kwargs)

Cumulative sum for each group.
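The cumulative methods above all keep one running state per group and emit one value per row. The sketch below models cumcount and cumsum in plain Python; it is not cuDF code, the helper names are hypothetical, and it assumes cumcount numbers rows from 0 within each group, as in pandas.

```python
from collections import defaultdict

def model_cumcount(keys):
    """Number each row within its group, starting at 0."""
    seen = defaultdict(int)
    out = []
    for key in keys:
        out.append(seen[key])
        seen[key] += 1
    return out

def model_cumsum(keys, values):
    """Running sum that restarts for every group key."""
    running = defaultdict(int)
    out = []
    for key, value in zip(keys, values):
        running[key] += value
        out.append(running[key])
    return out

keys = ["a", "b", "a", "b", "a"]
values = [1, 2, 3, 4, 5]
counts = model_cumcount(keys)
sums = model_cumsum(keys, values)
print(counts, sums)
```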

DataFrameGroupBy.describe([percentiles, ...])

Generate descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset's distribution, excluding NaN values.

DataFrameGroupBy.diff([periods, axis])

Get the difference between the values in each group.

DataFrameGroupBy.ewm(*args, **kwargs)

Return an ewm grouper, providing ewm functionality per group.

DataFrameGroupBy.expanding(*args, **kwargs)

Return an expanding grouper, providing expanding functionality per group.

DataFrameGroupBy.ffill([limit])

Forward fill NA values.

DataFrameGroupBy.first([numeric_only, min_count])

Compute first of group values.

DataFrameGroupBy.head([n, preserve_order])

Return the first n rows of each group.
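A plain-Python model of head may help show that rows are selected per group, not from the top of the frame. This sketch is not cuDF code: the helper name is hypothetical, n is assumed to be non-negative, and it models only the case where the original row order is kept.

```python
from collections import defaultdict

def model_head(keys, values, n=5):
    """Keep the first n rows seen for each group key, in original row order."""
    seen = defaultdict(int)
    out = []
    for key, value in zip(keys, values):
        if seen[key] < n:          # this group still has room
            out.append((key, value))
        seen[key] += 1
    return out

keys = ["a", "b", "a", "b", "a"]
values = [1, 2, 3, 4, 5]
first_two = model_head(keys, values, n=2)
print(first_two)
```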

DataFrameGroupBy.idxmax([numeric_only, ...])

Compute idxmax of group values.

DataFrameGroupBy.idxmin([numeric_only, ...])

Compute idxmin of group values.

DataFrameGroupBy.last([numeric_only, min_count])

Compute last of group values.

DataFrameGroupBy.max([numeric_only, min_count])

Compute max of group values.

DataFrameGroupBy.mean([numeric_only, min_count])

Compute mean of group values.

DataFrameGroupBy.median([numeric_only, ...])

Compute median of group values.

DataFrameGroupBy.min([numeric_only, min_count])

Compute min of group values.

DataFrameGroupBy.ngroup([ascending])

Number each group from 0 to the number of groups - 1.

DataFrameGroupBy.nth(n[, dropna])

Return the nth row from each group.

DataFrameGroupBy.nunique([dropna])

Return number of unique elements in the group.

DataFrameGroupBy.ohlc()

Compute open, high, low and close values of a group, excluding missing values.
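Read per group, "open" is the first value, "high" the maximum, "low" the minimum, and "close" the last value. The following plain-Python sketch models that reading, with None standing in for a missing value; the helper name is hypothetical and this is not cuDF code.

```python
from collections import defaultdict

def model_ohlc(keys, values):
    """Per-group first, max, min and last value, skipping missing entries."""
    grouped = defaultdict(list)
    for key, value in zip(keys, values):
        if value is not None:      # exclude missing values
            grouped[key].append(value)
    # Groups whose values are all missing do not appear in this simplified model.
    return {
        key: {"open": vals[0], "high": max(vals), "low": min(vals), "close": vals[-1]}
        for key, vals in grouped.items()
    }

keys = ["a", "b", "a", "b", "a"]
values = [3, None, 1, 4, 2]
ohlc = model_ohlc(keys, values)
print(ohlc)
```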

DataFrameGroupBy.pct_change([periods, ...])

Calculate the percent change between sequential elements in the group.
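As a worked illustration, the change for a row is (current - previous) / previous, where "previous" is the preceding row of the same group. The sketch below is a plain-Python model, not cuDF code; the helper name is hypothetical, it assumes a lag of one row, and it uses None where a group has no previous row.

```python
def model_pct_change(keys, values):
    """Fractional change from the previous value within the same group."""
    previous = {}
    out = []
    for key, value in zip(keys, values):
        if key in previous:
            # A previous value of 0 would raise ZeroDivisionError in this toy model.
            out.append((value - previous[key]) / previous[key])
        else:
            out.append(None)       # first row of a group has nothing to compare with
        previous[key] = value
    return out

keys = ["a", "b", "a", "b", "a"]
values = [10, 4, 15, 5, 30]
changes = model_pct_change(keys, values)
print(changes)
```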

DataFrameGroupBy.prod([numeric_only, min_count])

Compute prod of group values.

DataFrameGroupBy.quantile([q, ...])

Compute the column-wise quantiles of the values in each group.

DataFrameGroupBy.rank([method, ascending, ...])

Return the rank of values within each group.

DataFrameGroupBy.resample(rule, *args[, ...])

Provide resampling when using a TimeGrouper.

DataFrameGroupBy.rolling(*args, **kwargs)

Return a RollingGroupby object that enables rolling window calculations on the groups.

DataFrameGroupBy.sample([n, frac, replace, ...])

Return a random sample of items in each group.

DataFrameGroupBy.shift([periods, freq, ...])

Shift each group by periods positions.
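Shifting within groups means a value never crosses a group boundary. A minimal plain-Python model is sketched below; it is not cuDF code, the helper name is hypothetical, it handles only periods of 1 or more, and it fills vacated slots with None.

```python
from collections import defaultdict

def model_shift(keys, values, periods=1):
    """Give each row the value found `periods` rows earlier in the same group."""
    history = defaultdict(list)
    out = []
    for key, value in zip(keys, values):
        seen = history[key]
        # Not enough earlier rows in this group yet: fill with None.
        out.append(seen[-periods] if len(seen) >= periods else None)
        seen.append(value)
    return out

keys = ["a", "b", "a", "b", "a"]
values = [1, 2, 3, 4, 5]
shifted = model_shift(keys, values)
print(shifted)
```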

DataFrameGroupBy.size()

Return the size of each group.
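size() and count() (listed earlier in this section) are easy to confuse. The plain-Python model below sketches the distinction under the assumption, carried over from pandas, that size counts every row while count skips missing values; the helper name is hypothetical, None stands in for a missing value, and this is not cuDF code.

```python
from collections import defaultdict

def model_size_and_count(keys, values):
    """Rows per group (size) versus non-missing values per group (count)."""
    size = defaultdict(int)
    count = defaultdict(int)
    for key, value in zip(keys, values):
        size[key] += 1
        count[key] += value is not None   # True adds 1, False adds 0
    return dict(size), dict(count)

keys = ["a", "b", "a", "b", "a"]
values = [1, None, 3, 4, None]
size, count = model_size_and_count(keys, values)
print(size, count)
```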

DataFrameGroupBy.std([ddof, engine, ...])

Compute the column-wise std of the values in each group.

DataFrameGroupBy.sum([numeric_only, min_count])

Compute sum of group values.

DataFrameGroupBy.var([ddof, engine, ...])

Compute the column-wise variance of the values in each group.

DataFrameGroupBy.tail([n, preserve_order])

Return the last n rows of each group.

DataFrameGroupBy.take(indices)

Return the elements in the given positional indices in each group.

DataFrameGroupBy.value_counts([subset, ...])

Return a Series or DataFrame containing counts of unique rows.

SeriesGroupBy computations / descriptive stats#

SeriesGroupBy.all([skipna])

Return True if all values in the group are truthy, else False.

SeriesGroupBy.any([skipna])

Return True if any value in the group is truthy, else False.

SeriesGroupBy.bfill([limit])

Backward fill NA values.

SeriesGroupBy.corr(other[, method, min_periods])

SeriesGroupBy.count([dropna])

Compute the number of values in each column.

SeriesGroupBy.cov([min_periods, ddof, ...])

Compute the pairwise covariance among the columns of a DataFrame, excluding NA/null values.

SeriesGroupBy.cumcount([ascending])

Return the cumulative count of keys in each group.

SeriesGroupBy.cummax(*args, **kwargs)

Cumulative max for each group.

SeriesGroupBy.cummin(*args, **kwargs)

Cumulative min for each group.

SeriesGroupBy.cumprod(*args, **kwargs)

Cumulative product for each group.

SeriesGroupBy.cumsum(*args, **kwargs)

Cumulative sum for each group.

SeriesGroupBy.describe([percentiles, ...])

Generate descriptive statistics that summarize the central tendency, dispersion, and shape of a dataset's distribution, excluding NaN values.

SeriesGroupBy.diff([periods, axis])

Get the difference between the values in each group.

SeriesGroupBy.ewm(*args, **kwargs)

Return an ewm grouper, providing ewm functionality per group.

SeriesGroupBy.expanding(*args, **kwargs)

Return an expanding grouper, providing expanding functionality per group.

SeriesGroupBy.ffill([limit])

Forward fill NA values.

SeriesGroupBy.first([numeric_only, min_count])

Compute first of group values.

SeriesGroupBy.head([n, preserve_order])

Return the first n rows of each group.

SeriesGroupBy.last([numeric_only, min_count])

Compute last of group values.

SeriesGroupBy.idxmax([numeric_only, min_count])

Compute idxmax of group values.

SeriesGroupBy.idxmin([numeric_only, min_count])

Compute idxmin of group values.

SeriesGroupBy.is_monotonic_increasing

Return whether each group's values are monotonically increasing.

SeriesGroupBy.is_monotonic_decreasing

Return whether each group's values are monotonically decreasing.

SeriesGroupBy.max([numeric_only, min_count])

Compute max of group values.

SeriesGroupBy.mean([numeric_only, min_count])

Compute mean of group values.

SeriesGroupBy.median([numeric_only, min_count])

Compute median of group values.

SeriesGroupBy.min([numeric_only, min_count])

Compute min of group values.

SeriesGroupBy.ngroup([ascending])

Number each group from 0 to the number of groups - 1.

SeriesGroupBy.nlargest([n, keep])

Return the largest n elements.

SeriesGroupBy.nsmallest([n, keep])

Return the smallest n elements.

SeriesGroupBy.nth(n[, dropna])

Return the nth row from each group.

SeriesGroupBy.nunique([dropna])

Return number of unique elements in the group.

SeriesGroupBy.unique()

Get a list of the unique values for each column in each group.

SeriesGroupBy.ohlc()

Compute open, high, low and close values of a group, excluding missing values.

SeriesGroupBy.pct_change([periods, ...])

Calculate the percent change between sequential elements in the group.

SeriesGroupBy.prod([numeric_only, min_count])

Compute prod of group values.

SeriesGroupBy.quantile([q, interpolation, ...])

Compute the column-wise quantiles of the values in each group.

SeriesGroupBy.rank([method, ascending, ...])

Return the rank of values within each group.

SeriesGroupBy.resample(rule, *args[, ...])

Provide resampling when using a TimeGrouper.

SeriesGroupBy.rolling(*args, **kwargs)

Return a RollingGroupby object that enables rolling window calculations on the groups.

SeriesGroupBy.sample([n, frac, replace, ...])

Return a random sample of items in each group.

SeriesGroupBy.shift([periods, freq, axis, ...])

Shift each group by periods positions.

SeriesGroupBy.size()

Return the size of each group.

SeriesGroupBy.std([ddof, engine, ...])

Compute the column-wise std of the values in each group.

SeriesGroupBy.sum([numeric_only, min_count])

Compute sum of group values.

SeriesGroupBy.var([ddof, engine, ...])

Compute the column-wise variance of the values in each group.

SeriesGroupBy.tail([n, preserve_order])

Return the last n rows of each group.

SeriesGroupBy.take(indices)

Return the elements in the given positional indices in each group.

SeriesGroupBy.value_counts([normalize, ...])

Plotting and visualization#

DataFrameGroupBy.boxplot([subplots, column, ...])

DataFrameGroupBy.hist([column, by, grid, ...])

SeriesGroupBy.hist([by, ax, grid, ...])

DataFrameGroupBy.plot

Make plots of a grouped Series or DataFrame.

SeriesGroupBy.plot

Make plots of a grouped Series or DataFrame.