DataFrame#

Constructor#

DataFrame([data, index, columns, dtype, ...])

A GPU DataFrame object.

Attributes and underlying data#

Axes

DataFrame.axes

Return a list representing the axes of the DataFrame.

DataFrame.index

Get the labels for the rows.

DataFrame.columns

Return the column labels of the DataFrame.

DataFrame.dtypes

Return the dtypes in this object.

DataFrame.info([verbose, buf, max_cols, ...])

Print a concise summary of a DataFrame.

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

DataFrame.values

Return a CuPy representation of the DataFrame.

DataFrame.ndim

Dimension of the data.

DataFrame.size

Return the number of elements in the underlying data.

DataFrame.shape

Returns a tuple representing the dimensionality of the DataFrame.

DataFrame.memory_usage([index, deep])

Return the memory usage of an object.

DataFrame.empty

Indicator whether DataFrame or Series is empty.

Conversion#

DataFrame.astype(dtype[, copy, errors])

Cast the object to the given dtype.
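A minimal sketch of `astype` with a per-column mapping. The alias `xdf` is hypothetical, and the pandas fallback is an assumption made only so the snippet runs on a machine without a GPU; it relies on both libraries accepting a dict of column name to dtype here.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same astype signature

df = xdf.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Cast only column "a"; columns not named in the mapping keep their dtype.
converted = df.astype({"a": "float32"})

print(converted.dtypes)
```

Passing a single dtype instead of a dict, for example `df.astype("float32")`, casts every column.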

DataFrame.convert_dtypes([infer_objects, ...])

Convert columns to the best possible nullable dtypes.

DataFrame.copy([deep])

Make a copy of this object's indices and data.

Indexing, iteration#

DataFrame.head([n])

Return the first n rows.

DataFrame.at

Alias for DataFrame.loc; provided for compatibility with Pandas.

DataFrame.iat

Alias for DataFrame.iloc; provided for compatibility with Pandas.

DataFrame.loc

Select rows and columns by label or boolean mask.

DataFrame.iloc

Select values by position.

DataFrame.insert(loc, name, value[, nan_as_null])

Add a column to DataFrame at the index specified by loc.

DataFrame.__iter__()

DataFrame.items()

Iterate over (column name, Series) pairs.

DataFrame.keys()

Get the columns.

DataFrame.iterrows()

Iteration is unsupported.

DataFrame.itertuples([index, name])

Iteration is unsupported.

DataFrame.pop(item)

Return a column and drop it from the DataFrame.

DataFrame.tail([n])

Return the last n rows as a new DataFrame or Series.

DataFrame.isin(values)

Whether each element in the DataFrame is contained in values.

DataFrame.squeeze([axis])

Squeeze 1-dimensional axis objects into scalars.

DataFrame.where(cond[, other, inplace])

Replace values where the condition is False.

DataFrame.mask(cond[, other, inplace])

Replace values where the condition is True.
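Because `where` and `mask` are mirror images, a side-by-side sketch can help. The alias `xdf` is hypothetical and the pandas fallback is an assumption for machines without a GPU; the sample data is made up for illustration.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same where/mask signatures

df = xdf.DataFrame({"a": [1, -2, 3], "b": [-4, 5, -6]})

# where: keep entries where the condition is True, replace the rest with 0.
kept_positive = df.where(df > 0, 0)

# mask: the inverse, replace entries where the condition is True with 0.
zeroed_positive = df.mask(df > 0, 0)

print(kept_positive)
print(zeroed_positive)
```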

DataFrame.query(expr[, local_dict])

Query with a boolean expression using Numba to compile a GPU kernel.

Binary operator functions#

DataFrame.add(other[, axis, level, fill_value])

Get Addition of DataFrame or Series and other, element-wise (binary operator add).
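The method form differs from the `+` operator mainly through `fill_value`, which substitutes a value for missing entries before the operation. A minimal sketch, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU:

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same add signature

left = xdf.DataFrame({"x": [1.0, None, 3.0]})
right = xdf.DataFrame({"x": [10.0, 20.0, None]})

# Plain operator: the result is missing wherever either input is missing.
plain = left + right

# Method form: a missing entry on one side is treated as 0 before adding.
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```

The same `fill_value` parameter appears on the other arithmetic methods listed in this section.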

DataFrame.sub(other[, axis, level, fill_value])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).

DataFrame.subtract(other[, axis, level, ...])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).

DataFrame.mul(other[, axis, level, fill_value])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).

DataFrame.multiply(other[, axis, level, ...])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).

DataFrame.truediv(other[, axis, level, ...])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

DataFrame.div(other[, axis, level, fill_value])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

DataFrame.divide(other[, axis, level, ...])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

DataFrame.floordiv(other[, axis, level, ...])

Get Integer division of DataFrame or Series and other, element-wise (binary operator floordiv).

DataFrame.mod(other[, axis, level, fill_value])

Get Modulo of DataFrame or Series and other, element-wise (binary operator mod).

DataFrame.pow(other[, axis, level, fill_value])

Get Exponential of DataFrame or Series and other, element-wise (binary operator pow).

DataFrame.dot(other[, reflect])

Get dot product of frame and other, (binary operator dot).

DataFrame.radd(other[, axis, level, fill_value])

Get Addition of DataFrame or Series and other, element-wise (binary operator radd).

DataFrame.rsub(other[, axis, level, fill_value])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator rsub).

DataFrame.rmul(other[, axis, level, fill_value])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator rmul).

DataFrame.rdiv(other[, axis, level, fill_value])

Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).

DataFrame.rtruediv(other[, axis, level, ...])

Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).

DataFrame.rfloordiv(other[, axis, level, ...])

Get Integer division of DataFrame or Series and other, element-wise (binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Get Modulo of DataFrame or Series and other, element-wise (binary operator rmod).

DataFrame.rpow(other[, axis, level, fill_value])

Get Exponential of DataFrame or Series and other, element-wise (binary operator rpow).
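The `r`-prefixed methods swap the operand order, which matters for non-commutative operations. A short sketch, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU:

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same reflected operators

df = xdf.DataFrame({"x": [1, 2, 4]})

forward = df.sub(10)         # df - 10
reflected = df.rsub(10)      # 10 - df
quotient = df.rtruediv(8)    # 8 / df

print(forward)
print(reflected)
print(quotient)
```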

DataFrame.round([decimals, how])

Round to a variable number of decimal places.

DataFrame.lt(other[, axis, level, fill_value])

Get Less than of DataFrame or Series and other, element-wise (binary operator lt).

DataFrame.gt(other[, axis, level, fill_value])

Get Greater than of DataFrame or Series and other, element-wise (binary operator gt).

DataFrame.le(other[, axis, level, fill_value])

Get Less than or equal to of DataFrame or Series and other, element-wise (binary operator le).

DataFrame.ge(other[, axis, level, fill_value])

Get Greater than or equal to of DataFrame or Series and other, element-wise (binary operator ge).

DataFrame.ne(other[, axis, level, fill_value])

Get Not equal to of DataFrame or Series and other, element-wise (binary operator ne).

DataFrame.eq(other[, axis, level, fill_value])

Get Equal to of DataFrame or Series and other, element-wise (binary operator eq).

DataFrame.product([axis, skipna, dtype, ...])

Return product of the values in the DataFrame.

Function application, GroupBy & window#

DataFrame.agg(aggs[, axis])

Aggregate using one or more operations over the specified axis.

DataFrame.apply(func[, axis, raw, ...])

Apply a function along an axis of the DataFrame.

DataFrame.applymap(func[, na_action])

Apply a function to a DataFrame elementwise.

DataFrame.apply_chunks(func, incols, outcols)

Transform user-specified chunks using the user-provided function.

DataFrame.apply_rows(func, incols, outcols, ...)

Apply a row-wise user defined function.

DataFrame.groupby([by, axis, level, ...])

Group using a mapper or by a Series of columns.
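A minimal grouping sketch, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU. `sort=True` is passed explicitly so the group order is the same in both libraries rather than depending on either library's default.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same groupby signature

df = xdf.DataFrame({"k": [1, 2, 1, 2], "v": [1, 2, 3, 4]})

# Sum column "v" within each distinct value of "k", ordered by key.
totals = df.groupby("k", sort=True)["v"].sum()

print(totals)
```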

DataFrame.map(func[, na_action])

Apply a function to a DataFrame elementwise.

DataFrame.pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

DataFrame.rolling(window[, min_periods, ...])

Rolling window calculations.
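A short rolling-sum sketch, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU. With a window of 2 and the default minimum number of observations, the first row has no complete window and comes back missing.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same rolling signature

df = xdf.DataFrame({"v": [1, 2, 3]})

# Each output row is the sum of the current row and the one before it.
rolled = df.rolling(2).sum()

print(rolled)
```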

Computations / descriptive stats#

DataFrame.abs()

Return a Series/DataFrame with absolute numeric value of each element.

DataFrame.all([axis, bool_only, skipna])

Return whether all elements are True in DataFrame.

DataFrame.any([axis, bool_only, skipna])

Return whether any element is True in the DataFrame.

DataFrame.clip([lower, upper, inplace, axis])

Trim values at input threshold(s).

DataFrame.corr([method, min_periods])

Compute the correlation matrix of a DataFrame.

DataFrame.count([axis, numeric_only])

Count non-NA cells for each column or row.

DataFrame.cov(**kwargs)

Compute the covariance matrix of a DataFrame.

DataFrame.cummax([axis])

Return cumulative max of the DataFrame.

DataFrame.cummin([axis])

Return cumulative min of the DataFrame.

DataFrame.cumprod([axis])

Return cumulative product of the DataFrame.

DataFrame.cumsum([axis])

Return cumulative sum of the DataFrame.

DataFrame.describe([percentiles, include, ...])

Generate descriptive statistics.

DataFrame.diff([periods, axis])

First discrete difference of element.

DataFrame.eval(expr[, inplace])

Evaluate a string describing operations on DataFrame columns.

DataFrame.kurt([axis, skipna, numeric_only])

Return Fisher's unbiased kurtosis of a sample.

DataFrame.kurtosis([axis, skipna, numeric_only])

Return Fisher's unbiased kurtosis of a sample.

DataFrame.max([axis, skipna, numeric_only])

Return the maximum of the values in the DataFrame.

DataFrame.mean([axis, skipna, numeric_only])

Return the mean of the values for the requested axis.

DataFrame.median([axis, skipna, level, ...])

Return the median of the values for the requested axis.

DataFrame.min([axis, skipna, numeric_only])

Return the minimum of the values in the DataFrame.

DataFrame.mode([axis, numeric_only, dropna])

Get the mode(s) of each element along the selected axis.

DataFrame.pct_change([periods, fill_method, ...])

Calculates the percent change between sequential elements in the DataFrame.

DataFrame.prod([axis, skipna, dtype, ...])

Return product of the values in the DataFrame.

DataFrame.product([axis, skipna, dtype, ...])

Return product of the values in the DataFrame.

DataFrame.quantile([q, axis, numeric_only, ...])

Return values at the given quantile.

DataFrame.rank([axis, method, numeric_only, ...])

Compute numerical data ranks (1 through n) along axis.

DataFrame.round([decimals, how])

Round to a variable number of decimal places.

DataFrame.scale()

Scale values to [0, 1] in float64.

DataFrame.skew([axis, skipna, numeric_only])

Return unbiased Fisher-Pearson skew of a sample.

DataFrame.sum([axis, skipna, dtype, ...])

Return sum of the values in the DataFrame.

DataFrame.std([axis, skipna, ddof, numeric_only])

Return sample standard deviation of the DataFrame.

DataFrame.var([axis, skipna, ddof, numeric_only])

Return unbiased variance of the DataFrame.

DataFrame.nunique([axis, dropna])

Count number of distinct elements in specified axis.

DataFrame.value_counts([subset, normalize, ...])

Return a Series containing counts of unique rows in the DataFrame.

Reindexing / selection / label manipulation#

DataFrame.add_prefix(prefix)

Prefix labels with string prefix.

DataFrame.add_suffix(suffix)

Suffix labels with string suffix.

DataFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, keep, ...])

Return DataFrame with duplicate rows removed.

DataFrame.duplicated([subset, keep])

Return boolean Series denoting duplicate rows.

DataFrame.equals(other)

Test whether two objects contain the same elements.

DataFrame.first(offset)

Select initial periods of time series data based on a date offset.

DataFrame.head([n])

Return the first n rows.

DataFrame.last(offset)

Select final periods of time series data based on a date offset.

DataFrame.reindex([labels, index, columns, ...])

Conform DataFrame to new index.

DataFrame.rename([mapper, index, columns, ...])

Alter column and index labels.

DataFrame.reset_index([level, drop, ...])

Reset the index of the DataFrame, or a level of it.

DataFrame.sample([n, frac, replace, ...])

Return a random sample of items from an axis of object.

DataFrame.searchsorted(values[, side, ...])

Find indices where elements should be inserted to maintain order.

DataFrame.set_index(keys[, drop, append, ...])

Return a new DataFrame with a new index.

DataFrame.repeat(repeats[, axis])

Repeats elements consecutively.

DataFrame.tail([n])

Return the last n rows as a new DataFrame or Series.

DataFrame.take(indices[, axis])

Return a new frame containing the rows specified by indices.

DataFrame.tile(count)

Repeats the rows count times to form a new Frame.

DataFrame.truncate([before, after, axis, copy])

Truncate a Series or DataFrame before and after some index value.

Missing data handling#

DataFrame.backfill([value, axis, inplace, limit])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.bfill([value, axis, inplace, limit])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.dropna([axis, how, thresh, ...])

Drop rows (or columns) containing nulls from the DataFrame.

DataFrame.ffill([value, axis, inplace, limit])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.fillna([value, method, axis, ...])

Fill null values with value or specified method.

DataFrame.interpolate([method, axis, limit, ...])

Interpolate data values between some points.

DataFrame.isna()

Identify missing values.

DataFrame.isnull()

Identify missing values.

DataFrame.nans_to_nulls()

Convert NaNs (if any) to nulls.

DataFrame.notna()

Identify non-missing values.

DataFrame.notnull()

Identify non-missing values.

DataFrame.pad([value, axis, inplace, limit])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.replace([to_replace, value, ...])

Replace values given in to_replace with value.
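The detection, filling, and dropping methods listed in this section are often used together. A minimal sketch, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU; the sample data is made up for illustration.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same missing-data methods

df = xdf.DataFrame({"a": [1.0, None, 3.0], "b": [None, None, 6.0]})

# isna() yields a boolean frame; summing it counts missing entries per column.
null_counts = df.isna().sum()

# fillna() replaces every missing entry with the given value.
filled = df.fillna(0)

# dropna() with default arguments removes rows that contain any missing entry.
complete_rows = df.dropna()

print(null_counts)
print(filled)
print(complete_rows)
```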

Reshaping, sorting, transposing#

DataFrame.argsort([by, axis, kind, order, ...])

Return the integer indices that would sort the DataFrame.

DataFrame.interleave_columns()

Interleave Series columns of a table into a single column.

DataFrame.partition_by_hash(columns, nparts)

Partition the dataframe by the hashed value of data in columns.

DataFrame.pivot(*, columns[, index, values])

Return reshaped DataFrame organized by the given index and column values.

DataFrame.pivot_table([values, index, ...])

Create a spreadsheet-style pivot table as a DataFrame.

DataFrame.scatter_by_map(map_index[, ...])

Scatter to a list of dataframes.

DataFrame.sort_values(by[, axis, ascending, ...])

Sort by the values along either axis.

DataFrame.sort_index([axis, level, ...])

Sort object by labels (along an axis).

DataFrame.nlargest(n, columns[, keep])

Return the first n rows ordered by columns in descending order.

DataFrame.nsmallest(n, columns[, keep])

Return the first n rows ordered by columns in ascending order.
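`nlargest` can be read as shorthand for a descending sort followed by `head`. A short sketch of that equivalence, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU; the scores are distinct, so tie handling via `keep` does not come into play.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same sorting methods

df = xdf.DataFrame({"id": [10, 11, 12, 13], "score": [3, 9, 1, 7]})

# Two rows with the highest score.
top2 = df.nlargest(2, "score")

# The same selection spelled out as sort + head.
top2_sorted = df.sort_values("score", ascending=False).head(2)

print(top2)
print(top2_sorted)
```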

DataFrame.swaplevel([i, j, axis])

Swap level i with level j.

DataFrame.stack([level, dropna, future_stack])

Stack the prescribed level(s) from columns to index.

DataFrame.unstack([level, fill_value])

Pivot one or more levels of the (necessarily hierarchical) index labels.

DataFrame.melt(**kwargs)

Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set.

DataFrame.explode(column[, ignore_index])

Transform each element of a list-like to a row, replicating index values.

DataFrame.to_struct([name])

Return a struct Series composed of the columns of the DataFrame.

DataFrame.T

Transpose index and columns.

DataFrame.transpose()

Transpose index and columns.

Combining / comparing / joining / merging#

DataFrame.assign(**kwargs)

Assign columns to DataFrame from keyword arguments.

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns with other DataFrame on index or on a key column.

DataFrame.merge(right[, on, left_on, ...])

Merge GPU DataFrame objects by performing a database-style join operation by columns or indexes.
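A minimal sketch of two join types on a shared key column, assuming a hypothetical alias `xdf` and a pandas fallback for machines without a GPU. The results are sorted explicitly because the example does not rely on any particular row order coming out of the join.

```python
try:
    import cudf as xdf  # GPU DataFrame library documented on this page
except ImportError:
    import pandas as xdf  # assumed CPU stand-in with the same merge signature

left = xdf.DataFrame({"key": [1, 2, 3], "lval": [10, 20, 30]})
right = xdf.DataFrame({"key": [2, 3, 4], "rval": [200, 300, 400]})

# Inner join: only keys present in both frames survive.
inner = left.merge(right, on="key", how="inner").sort_values("key")

# Left join: every left row survives; unmatched right values are missing.
left_join = left.merge(right, on="key", how="left").sort_values("key")

print(inner)
print(left_join)
```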

DataFrame.update(other[, join, overwrite, ...])

Modify a DataFrame in place using non-NA values from another DataFrame.

Serialization / IO / conversion#

DataFrame.deserialize(header, frames)

Generate an object from a serialized representation.

DataFrame.device_deserialize(header, frames)

Perform device-side deserialization tasks.

DataFrame.device_serialize()

Serialize data and metadata associated with device memory.

DataFrame.from_arrow(table)

Convert from PyArrow Table to DataFrame.

DataFrame.from_dict(data[, orient, dtype, ...])

Construct DataFrame from dict of array-like or dicts.

DataFrame.from_pandas(dataframe[, nan_as_null])

Convert from a Pandas DataFrame.

DataFrame.from_records(data[, index, ...])

Convert structured or record ndarray to DataFrame.

DataFrame.hash_values([method, seed])

Compute the hash of the values in each row.

DataFrame.host_deserialize(header, frames)

Perform host-side deserialization tasks.

DataFrame.host_serialize()

Serialize data and metadata associated with host memory.

DataFrame.serialize()

Generate an equivalent serializable representation of an object.

DataFrame.to_arrow([preserve_index])

Convert to a PyArrow Table.

DataFrame.to_dict([orient, into])

Convert the DataFrame to a dictionary.

DataFrame.to_dlpack()

Converts a cuDF object into a DLPack tensor.

DataFrame.to_parquet(path[, engine, ...])

Write a DataFrame to the Parquet format.

DataFrame.to_csv([path_or_buf, sep, na_rep, ...])

Write a DataFrame to CSV file format.

DataFrame.to_cupy([dtype, copy, na_value])

Convert the Frame to a CuPy array.

DataFrame.to_hdf(path_or_buf, key, *args, ...)

Write the contained data to an HDF5 file using HDFStore.

DataFrame.to_json([path_or_buf])

Convert the cuDF object to a JSON string.

DataFrame.to_numpy([dtype, copy, na_value])

Convert the Frame to a NumPy array.

DataFrame.to_pandas(*[, nullable, arrow_type])

Convert to a Pandas DataFrame.

DataFrame.to_feather(path, *args, **kwargs)

Write a DataFrame to the feather format.

DataFrame.to_records([index])

Convert to a NumPy recarray.

DataFrame.to_string()

Convert to string.

DataFrame.values

Return a CuPy representation of the DataFrame.

DataFrame.values_host

Return a NumPy representation of the data.