cudf.core.groupby.groupby.DataFrameGroupBy.aggregate
- DataFrameGroupBy.aggregate(func=None, *args, engine=None, engine_kwargs=None, **kwargs)
Apply aggregation(s) to the groups.
- Parameters:
- func : str, callable, list or dict
Argument specifying the aggregation(s) to perform on the groups. func can be any of the following:
string: the name of a supported aggregation
callable: a function that accepts a Series/DataFrame and performs a supported operation on it.
list: a list of strings/callables specifying the aggregations to perform on every column.
dict: a mapping of column names to string/callable specifying the aggregations to perform on those columns.
See the user guide for supported aggregations.
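The four accepted forms of func differ only in which aggregations end up applied to which columns. The following is a conceptual sketch in plain Python of that mapping; it is not cuDF's implementation and does not run on the GPU. The names NAMED_AGGS, normalize, and aggregate are hypothetical and exist only for illustration, and the lookup table covers only a handful of aggregations.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup of named aggregations (illustration only;
# the real set of supported names is listed in the user guide).
NAMED_AGGS = {"sum": sum, "min": min, "max": max, "mean": mean}


def normalize(func, columns):
    """Expand `func` into a mapping {column: [aggregation, ...]}."""
    if isinstance(func, dict):
        # dict: each listed column gets its own aggregation(s)
        return {
            col: list(f) if isinstance(f, (list, tuple)) else [f]
            for col, f in func.items()
        }
    # str, callable, or list: the same aggregation(s) for every column
    aggs = list(func) if isinstance(func, (list, tuple)) else [func]
    return {col: aggs for col in columns}


def aggregate(rows, key, func):
    """Group `rows` (a list of dicts) by `key` and apply `func`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    value_columns = [c for c in rows[0] if c != key]
    plan = normalize(func, value_columns)

    result = {}
    for group_key in sorted(groups):
        out = {}
        for col, aggs in plan.items():
            values = [r[col] for r in groups[group_key]]
            for agg in aggs:
                # Strings are looked up by name; callables are used directly.
                fn = NAMED_AGGS[agg] if isinstance(agg, str) else agg
                name = agg if isinstance(agg, str) else agg.__name__
                out[(col, name)] = fn(values)
        result[group_key] = out
    return result


rows = [
    {"a": 1, "b": 1, "c": 2},
    {"a": 1, "b": 2, "c": 2},
    {"a": 2, "b": 3, "c": 1},
]

by_name = aggregate(rows, "a", "sum")                    # one name, all columns
by_list = aggregate(rows, "a", ["sum", "min"])           # several, all columns
by_dict = aggregate(rows, "a", {"b": ["min", "mean"]})   # per-column choice
by_call = aggregate(rows, "a", len)                      # a callable
```

Each result maps a group key to values keyed by (column, aggregation name), which mirrors the two-level column labels that appear when a list or dict is passed.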
- Returns:
- A Series or DataFrame containing the combined results of the aggregation(s).
Examples
>>> import cudf
>>> a = cudf.DataFrame({
...     'a': [1, 1, 2],
...     'b': [1, 2, 3],
...     'c': [2, 2, 1]
... })
>>> a.groupby('a', sort=True).agg('sum')
   b  c
a
1  3  4
2  3  1
Specifying a list of aggregations to perform on each column.
>>> import cudf
>>> a = cudf.DataFrame({
...     'a': [1, 1, 2],
...     'b': [1, 2, 3],
...     'c': [2, 2, 1]
... })
>>> a.groupby('a', sort=True).agg(['sum', 'min'])
    b       c
  sum min sum min
a
1   3   1   4   2
2   3   3   1   1
Using a dict to specify aggregations to perform per column.
>>> import cudf
>>> a = cudf.DataFrame({
...     'a': [1, 1, 2],
...     'b': [1, 2, 3],
...     'c': [2, 2, 1]
... })
>>> a.groupby('a', sort=True).agg({'a': 'max', 'b': ['min', 'mean']})
    a   b
  max min mean
a
1   1   1  1.5
2   2   3  3.0
Using lambdas/callables to specify aggregations taking parameters.
>>> import cudf
>>> a = cudf.DataFrame({
...     'a': [1, 1, 2],
...     'b': [1, 2, 3],
...     'c': [2, 2, 1]
... })
>>> f1 = lambda x: x.quantile(0.5); f1.__name__ = "q0.5"
>>> f2 = lambda x: x.quantile(0.75); f2.__name__ = "q0.75"
>>> a.groupby('a').agg([f1, f2])
     b          c
  q0.5 q0.75 q0.5 q0.75
a
1  1.5  1.75  2.0   2.0
2  3.0  3.00  1.0   1.0