cudf.core.groupby.groupby.GroupBy.apply
GroupBy.apply(function, *args)
Apply a Python transformation function over each grouped chunk.
Parameters
func : function
    The Python transformation function that will be applied to each grouped chunk.
Examples
from cudf import DataFrame

df = DataFrame()
df['key'] = [0, 0, 1, 1, 2, 2, 2]
df['val'] = [0, 1, 2, 3, 4, 5, 6]
groups = df.groupby(['key'])

# Define a function to apply to each group (it receives the group's rows as a DataFrame)
def mult(df):
    df['out'] = df['key'] * df['val']
    return df

result = groups.apply(mult)
print(result)
Output:
   key  val  out
0    0    0    0
1    0    1    0
2    1    2    2
3    1    3    3
4    2    4    8
5    2    5   10
6    2    6   12
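The split-apply-combine idea behind the example above can be sketched in plain Python. This is only an illustration of the semantics, not how cuDF implements apply; the helper name groupby_apply and the list-of-dicts row representation are assumptions made for this sketch.

```python
from collections import defaultdict


def groupby_apply(rows, key, func):
    """Hypothetical helper: split rows by `key`, call `func` on each chunk,
    and concatenate the returned chunks in sorted key order."""
    groups = defaultdict(list)
    for row in rows:
        # Copy each row so `func` can mutate its chunk without touching the input.
        groups[row[key]].append(dict(row))
    combined = []
    for group_key in sorted(groups):
        combined.extend(func(groups[group_key]))
    return combined


def mult(chunk):
    # Same transformation as in the example: out = key * val for every row.
    for row in chunk:
        row["out"] = row["key"] * row["val"]
    return chunk


keys = [0, 0, 1, 1, 2, 2, 2]
vals = [0, 1, 2, 3, 4, 5, 6]
rows = [{"key": k, "val": v} for k, v in zip(keys, vals)]

result = groupby_apply(rows, "key", mult)
print([row["out"] for row in result])  # [0, 0, 2, 3, 8, 10, 12]
```

The function passed in sees one chunk at a time, which is why the comment in the example speaks of groups rather than individual rows.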
Pandas Compatibility Note
groupby.apply
cuDF’s groupby.apply is limited compared to pandas. In some situations, pandas returns the grouped keys as part of the index, while cuDF omits them because they are redundant. For example:

>>> import pandas as pd
>>> import cudf
>>> df = pd.DataFrame({
...     'a': [1, 1, 2, 2],
...     'b': [1, 2, 1, 2],
...     'c': [1, 2, 3, 4],
... })
>>> gdf = cudf.from_pandas(df)
>>> df.groupby('a').apply(lambda x: x.iloc[[0]])
     a  b  c
a
1 0  1  1  1
2 2  2  1  3
>>> gdf.groupby('a').apply(lambda x: x.iloc[[0]])
   a  b  c
0  1  1  1
2  2  1  3
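When comparing results across the two libraries, one possible way to get the same index shape on the pandas side is the group_keys=False option of DataFrame.groupby. The sketch below uses pandas only and makes no claim about cuDF's handling of that option. Which columns pandas passes to the applied function has changed between releases, so only the index is inspected here.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 2, 2],
    'b': [1, 2, 1, 2],
    'c': [1, 2, 3, 4],
})

# group_keys=False asks pandas not to prepend the group key to the result index,
# so the original row labels are kept instead of an (a, row) MultiIndex.
result = df.groupby('a', group_keys=False).apply(lambda x: x.iloc[[0]])
print(list(result.index))
```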