Pandas Compatibility Notes
Pandas Compatibility Note
MultiIndex.get_loc
The return type of this function may deviate from that of the pandas method. If the index is neither lexicographically sorted nor unique, cuDF makes a best-effort attempt to coerce the found indices into a slice. For example:
>>> import pandas as pd
>>> import cudf
>>> x = pd.MultiIndex.from_tuples([
... (2, 1, 1), (1, 2, 3), (1, 2, 1),
... (1, 1, 1), (1, 1, 1), (2, 2, 1),
... ])
>>> x.get_loc(1)
array([False, True, True, True, True, False])
>>> cudf.from_pandas(x).get_loc(1)
slice(1, 5, 1)
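Code that has to accept either return shape can normalize both to integer positions before using them. The following is a minimal sketch, not part of cuDF or pandas: `loc_to_positions` is a hypothetical helper name, and it assumes the result is a slice, a single integer, or a boolean mask.

```python
from numbers import Integral


def loc_to_positions(loc, length):
    """Hypothetical helper: turn a get_loc result into a list of positions.

    `length` is the number of entries in the index; it is needed to
    resolve open-ended or negative slice bounds.
    """
    if isinstance(loc, slice):
        # slice.indices clamps start/stop/step to the given length.
        return list(range(*loc.indices(length)))
    if isinstance(loc, Integral):
        # A single matching position.
        return [int(loc)]
    # Otherwise assume a boolean mask (a NumPy array or a list of bools).
    return [i for i, flag in enumerate(loc) if flag]


# The two results from the example above, written out by hand.
pandas_style = [False, True, True, True, True, False]
cudf_style = slice(1, 5, 1)

print(loc_to_positions(pandas_style, 6))  # [1, 2, 3, 4]
print(loc_to_positions(cudf_style, 6))    # [1, 2, 3, 4]
```

Both calls yield the same positions, so downstream indexing does not need to know which library produced the result.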
Pandas Compatibility Note
groupby.fillna
This function may return its result in a different format from the pandas method. For example:
>>> df = pd.DataFrame({'k': [1, 1, 2], 'v': [2, None, 4]})
>>> gdf = cudf.from_pandas(df)
>>> df.groupby('k').fillna({'v': 4}) # pandas
       v
k
1 0  2.0
  1  4.0
2 2  4.0
>>> gdf.groupby('k').fillna({'v': 4}) # cudf
     v
0  2.0
1  4.0
2  4.0
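When the two results need to be compared, one option is to drop the group-key level from the pandas-shaped result so that it carries the same flat index as the cuDF one. The sketch below is illustrative only: `drop_group_key_levels` is a hypothetical helper, and the pandas-shaped frame is built by hand to mirror the output shown in this note rather than produced by calling `fillna`, since the exact index layout may depend on the pandas version.

```python
import pandas as pd


def drop_group_key_levels(result, keys):
    """Hypothetical helper: remove index levels named after the group keys."""
    to_drop = [name for name in result.index.names if name in keys]
    # Leave the frame untouched if there is nothing to drop, or if
    # dropping would remove every level of the index.
    if not to_drop or len(to_drop) == result.index.nlevels:
        return result
    return result.droplevel(to_drop)


# Hand-built frame with the (k, original row label) index shown above.
index = pd.MultiIndex.from_tuples([(1, 0), (1, 1), (2, 2)], names=["k", None])
pandas_shaped = pd.DataFrame({"v": [2.0, 4.0, 4.0]}, index=index)

flat = drop_group_key_levels(pandas_shaped, ["k"])
print(flat.index.tolist())  # [0, 1, 2]
```

A frame that already has a flat index passes through unchanged, so the helper can be applied to either result.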
Pandas Compatibility Note
groupby.apply
cuDF's groupby.apply is limited compared to pandas. In some situations, pandas returns the grouped keys as part of the index, while cuDF omits them because they are redundant. For example:
>>> df = pd.DataFrame({
...     'a': [1, 1, 2, 2],
...     'b': [1, 2, 1, 2],
...     'c': [1, 2, 3, 4]})
>>> gdf = cudf.from_pandas(df)
>>> df.groupby('a').apply(lambda x: x.iloc[[0]])
     a  b  c
a
1 0  1  1  1
2 2  2  1  3
>>> gdf.groupby('a').apply(lambda x: x.iloc[[0]])
   a  b  c
0  1  1  1
2  2  1  3
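On the pandas side, the `group_keys` parameter of `DataFrame.groupby` controls whether the group keys are prepended to the index of the `apply` result. As a rough sketch, passing `group_keys=False` should give pandas an index that matches the cuDF output above. Only pandas is used here, and the set of columns in the result may vary across pandas versions, so the example inspects just the index.

```python
import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 2, 2],
    'b': [1, 2, 1, 2],
    'c': [1, 2, 3, 4]})

# group_keys=False asks pandas not to add the group keys to the index,
# so the original row labels of the selected rows are kept.
first_rows = df.groupby('a', group_keys=False).apply(lambda x: x.iloc[[0]])

print(first_rows.index.tolist())  # [0, 2]
```

This keeps the original row labels (0 and 2), which is the layout cuDF returns.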