cudf.DataFrame.query

DataFrame.query(expr: str, local_dict: None | dict[str, Any] = None, global_dict: None | dict[str, Any] = None, **kwargs) -> DataFrame

Query with a boolean expression using Numba to compile a GPU kernel.

See pandas.DataFrame.query().

Parameters:
expr : str

A boolean expression. Names in the expression refer to columns. The name index can be used in place of the index name, but this is not supported for a MultiIndex. Names prefixed with @ refer to Python variables. An output value is null if any of its input values is null, regardless of the expression.

local_dict : dict, optional

A dictionary of local variables to be used in the query.

global_dict : dict, optional

A dictionary of global variables. If not provided, the globals from the calling environment are used.

**kwargs

Not supported.

Returns:
filtered : DataFrame

Examples

>>> df = cudf.DataFrame({
...     "a": [1, 2, 2],
...     "b": [3, 4, 5],
... })
>>> expr = "(a == 2 and b == 4) or (b == 3)"
>>> df.query(expr)
   a  b
0  1  3
1  2  4
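
Using index and an @ variable together (a minimal sketch, not taken from the cudf documentation): the parameter description says that index can stand in for the index name and that names prefixed with @ refer to Python variables. The alias xdf and the variable limit are illustrative names. The fallback to pandas is an assumption made only so the snippet also runs without a GPU; for this simple numeric case the two libraries are expected to agree.

```python
# Assumption: use cudf when it is importable and usable, otherwise fall
# back to pandas, whose query() accepts the same arguments used here.
try:
    import cudf as xdf
except Exception:
    import pandas as xdf

df = xdf.DataFrame({"a": [1, 2, 2], "b": [3, 4, 5]})

# `index` refers to the row index; `@limit` refers to a Python variable,
# supplied explicitly through local_dict rather than the calling scope.
limit = 4
result = df.query("index > 0 and b >= @limit", local_dict={"limit": limit})

# Keeps the rows with index 1 and 2 (b == 4 and b == 5).
print(result)
```

Passing limit through local_dict keeps the lookup explicit; when local_dict is omitted, the variable is expected to be resolved from the calling environment instead.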

DateTime conditionals:

>>> import numpy as np
>>> import datetime
>>> df = cudf.DataFrame()
>>> data = np.array(['2018-10-07', '2018-10-08'], dtype='datetime64')
>>> df['datetimes'] = data
>>> search_date = datetime.datetime.strptime('2018-10-08', '%Y-%m-%d')
>>> df.query('datetimes==@search_date')
   datetimes
1 2018-10-08

Using local_dict:

>>> import numpy as np
>>> import datetime
>>> df = cudf.DataFrame()
>>> data = np.array(['2018-10-07', '2018-10-08'], dtype='datetime64')
>>> df['datetimes'] = data
>>> search_date2 = datetime.datetime.strptime('2018-10-08', '%Y-%m-%d')
>>> df.query('datetimes==@search_date',
...          local_dict={'search_date': search_date2})
   datetimes
1 2018-10-08

Pandas Compatibility Note

pandas.DataFrame.query()

One difference from pandas is that query currently only supports numeric, datetime, timedelta, or bool dtypes.
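
Because of this restriction, a condition on a string column cannot be written as a query expression. One possible workaround, shown here as a sketch rather than an officially recommended pattern, is ordinary boolean-mask indexing. The column names name and value and the alias xdf are hypothetical, and the pandas fallback is an assumption made only so the snippet also runs without a GPU.

```python
# Assumption: use cudf when it is importable and usable, otherwise fall
# back to pandas; boolean-mask indexing is written the same way in both.
try:
    import cudf as xdf
except Exception:
    import pandas as xdf

df = xdf.DataFrame({"name": ["x", "y", "x"], "value": [1, 2, 3]})

# Build the mask with column comparisons instead of a query string,
# so the string column never has to pass through query().
mask = (df["name"] == "x") & (df["value"] > 1)
result = df[mask]

# Only the last row (name == "x", value == 3) satisfies both conditions.
print(result)
```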