DataFrame

Constructor

DataFrame([data, index, columns, dtype])

A GPU DataFrame object.

Attributes and underlying data

Axes

DataFrame.index

Return the index of the DataFrame.

DataFrame.columns

Return a tuple of columns.

DataFrame.dtypes

Return the dtypes in this object.

DataFrame.info([verbose, buf, max_cols, ...])

Print a concise summary of a DataFrame.

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame’s columns based on the column dtypes.

DataFrame.values

Return a CuPy representation of the DataFrame.

DataFrame.ndim

Dimension of the data.

DataFrame.size

Return the number of elements in the underlying data.

DataFrame.shape

Returns a tuple representing the dimensionality of the DataFrame.

DataFrame.memory_usage([index, deep])

Return the memory usage of each column in bytes.

DataFrame.empty

Indicate whether the DataFrame or Series is empty.

Conversion

DataFrame.astype(dtype[, copy, errors])

Cast the DataFrame to the given dtype.

DataFrame.copy([deep])

Make a copy of this object's indices and data.

Indexing, iteration

DataFrame.head([n])

Return the first n rows.

DataFrame.at

Alias for DataFrame.loc; provided for compatibility with Pandas.

DataFrame.iat

Alias for DataFrame.iloc; provided for compatibility with Pandas.

DataFrame.loc

Select rows and columns by label or boolean mask.

DataFrame.iloc

Select rows and columns by position.

DataFrame.insert(loc, name, value)

Add a column to DataFrame at the index specified by loc.

DataFrame.__iter__()

DataFrame.iteritems()

Iterate over (column name, Series) pairs.

DataFrame.keys()

Get the columns.

DataFrame.iterrows()

DataFrame.itertuples([index, name])

DataFrame.pop(item)

Return a column and drop it from the DataFrame.

DataFrame.tail([n])

Return the last n rows as a new DataFrame or Series.

DataFrame.isin(values)

Whether each element in the DataFrame is contained in values.

DataFrame.where(cond[, other, inplace])

Replace values where the condition is False.

DataFrame.mask(cond[, other, inplace])

Replace values where the condition is True.

DataFrame.query(expr[, local_dict])

Query with a boolean expression using Numba to compile a GPU kernel.

For more information on .at, .iat, .loc, and .iloc, see the indexing documentation.
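
The where and mask entries above are mirror images of each other, which is easy to mix up. The following is a plain-Python model of that behaviour on a single column, not cuDF code: the helper functions and the use of lists in place of GPU columns are assumptions made purely for illustration.

```python
def where(values, cond, other=None):
    """Keep values[i] where cond[i] is True; replace it with `other` where False."""
    return [v if c else other for v, c in zip(values, cond)]


def mask(values, cond, other=None):
    """Replace values[i] with `other` where cond[i] is True; keep it where False."""
    return [other if c else v for v, c in zip(values, cond)]


column = [1, -2, 3, -4]
is_positive = [v > 0 for v in column]

# Same condition, opposite outcome.
kept = where(column, is_positive, other=0)
hidden = mask(column, is_positive, other=0)

print(kept)    # [1, 0, 3, 0]
print(hidden)  # [0, -2, 0, -4]
```

With the same condition, where(cond, other) and mask(cond, other) always touch complementary sets of elements.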

Binary operator functions

DataFrame.add(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator add).

DataFrame.sub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

DataFrame.mul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

DataFrame.div(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.truediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.floordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator floordiv).

DataFrame.mod(other[, axis, level, fill_value])

Get Modulo division of dataframe and other, element-wise (binary operator mod).

DataFrame.pow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator pow).

DataFrame.radd(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator radd).

DataFrame.rsub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator rsub).

DataFrame.rmul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator rmul).

DataFrame.rdiv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.rtruediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.rfloordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Get Modulo division of dataframe and other, element-wise (binary operator rmod).

DataFrame.rpow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator rpow).
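
The reflected variants (radd, rsub, rmul, rdiv, rtruediv, rfloordiv, rmod, rpow) differ from their plain counterparts only in operand order. The sketch below models this with plain Python lists and the standard operator module; the binary_op helper is hypothetical and stands in for the element-wise GPU operation.

```python
import operator


def binary_op(column, other, op, reflect=False):
    """Apply `op` element-wise; with reflect=True the operands are swapped."""
    if reflect:
        return [op(other, v) for v in column]
    return [op(v, other) for v in column]


column = [1, 2, 4]

sub_result = binary_op(column, 10, operator.sub)                # models sub:  column - 10
rsub_result = binary_op(column, 10, operator.sub, reflect=True)  # models rsub: 10 - column
rpow_result = binary_op(column, 2, operator.pow, reflect=True)   # models rpow: 2 ** column
rfloordiv_result = binary_op(column, 9, operator.floordiv, reflect=True)  # 9 // column

print(sub_result)        # [-9, -8, -6]
print(rsub_result)       # [9, 8, 6]
print(rpow_result)       # [2, 4, 16]
print(rfloordiv_result)  # [9, 4, 2]
```

For commutative operations such as add and mul the plain and reflected forms give the same values; the distinction matters for sub, div, floordiv, mod, and pow.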

Function application, GroupBy & window

DataFrame.apply(func[, axis, raw, ...])

Apply a function along an axis of the DataFrame.

DataFrame.apply_chunks(func, incols, outcols)

Transform user-specified chunks using the user-provided function.

DataFrame.apply_rows(func, incols, outcols, ...)

Apply a row-wise user defined function.

DataFrame.pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

DataFrame.agg(aggs[, axis])

Aggregate using one or more operations over the specified axis.

DataFrame.groupby([by, axis, level, ...])

Group DataFrame using a mapper or by a Series of columns.

DataFrame.rolling(window[, min_periods, ...])

Rolling window calculations.
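
The pipe entry above states that pipe(func, *args, **kwargs) applies func(self, *args, **kwargs). A minimal sketch of why that is useful for chaining follows; the Table class and the scale and shift functions are hypothetical stand-ins, not part of cuDF.

```python
class Table:
    """Hypothetical stand-in for a DataFrame that holds one list of numbers."""

    def __init__(self, values):
        self.values = list(values)

    def pipe(self, func, *args, **kwargs):
        # Equivalent to calling func(self, *args, **kwargs) directly.
        return func(self, *args, **kwargs)


def scale(table, factor):
    """Multiply every value by `factor`."""
    return Table(v * factor for v in table.values)


def shift(table, offset=0):
    """Add `offset` to every value."""
    return Table(v + offset for v in table.values)


# Chained form reads left to right ...
piped = Table([1, 2, 3]).pipe(scale, 10).pipe(shift, offset=1)
# ... and matches the nested form, which reads inside out.
nested = shift(scale(Table([1, 2, 3]), 10), offset=1)

print(piped.values)   # [11, 21, 31]
print(nested.values)  # [11, 21, 31]
```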

Computations / descriptive stats

DataFrame.all([axis, bool_only, skipna, level])

Return whether all elements are True in the DataFrame.

DataFrame.any([axis, bool_only, skipna, level])

Return whether any element is True in the DataFrame.

DataFrame.clip([lower, upper, inplace, axis])

Trim values at input threshold(s).

DataFrame.corr()

Compute the correlation matrix of a DataFrame.

DataFrame.count([axis, level, numeric_only])

Count non-NA cells for each column or row.

DataFrame.cov(**kwargs)

Compute the covariance matrix of a DataFrame.

DataFrame.cummax([axis, skipna])

Return cumulative maximum of the Series or DataFrame.

DataFrame.cummin([axis, skipna])

Return cumulative minimum of the Series or DataFrame.

DataFrame.cumprod([axis, skipna])

Return cumulative product of the Series or DataFrame.

DataFrame.cumsum([axis, skipna])

Return cumulative sum of the Series or DataFrame.

DataFrame.describe([percentiles, include, ...])

Generate descriptive statistics.

DataFrame.kurt([axis, skipna, level, ...])

Return Fisher's unbiased kurtosis of a sample.

DataFrame.kurtosis([axis, skipna, level, ...])

Return Fisher's unbiased kurtosis of a sample.

DataFrame.max([axis, skipna, level, ...])

Return the maximum of the values in the DataFrame.

DataFrame.mean([axis, skipna, level, ...])

Return the mean of the values for the requested axis.

DataFrame.min([axis, skipna, level, ...])

Return the minimum of the values in the DataFrame.

DataFrame.mode([axis, numeric_only, dropna])

Get the mode(s) of each element along the selected axis.

DataFrame.prod([axis, skipna, dtype, level, ...])

Return product of the values in the DataFrame.

DataFrame.product([axis, skipna, dtype, ...])

Return product of the values in the DataFrame.

DataFrame.quantile([q, axis, numeric_only, ...])

Return values at the given quantile.

DataFrame.quantiles([q, interpolation])

Return values at the given quantile.

DataFrame.rank([axis, method, numeric_only, ...])

Compute numerical data ranks (1 through n) along axis.

DataFrame.round([decimals, how])

Round a DataFrame to a variable number of decimal places.

DataFrame.skew([axis, skipna, level, ...])

Return unbiased Fisher-Pearson skew of a sample.

DataFrame.sum([axis, skipna, dtype, level, ...])

Return sum of the values in the DataFrame.

DataFrame.std([axis, skipna, level, ddof, ...])

Return sample standard deviation of the DataFrame.

DataFrame.var([axis, skipna, level, ddof, ...])

Return unbiased variance of the DataFrame.
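
The std and var entries above take a ddof argument and are described as "sample" and "unbiased" respectively. The plain-Python sketch below shows what the delta degrees of freedom change in the formula. It assumes a default of ddof=1, which the wording suggests but the listing does not state, and the variance helper is illustrative only.

```python
import math


def variance(values, ddof=1):
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((v - mean) ** 2 for v in values)
    return squared_dev / (n - ddof)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_var = variance(data)              # ddof=1: divides by n - 1 (unbiased estimate)
population_var = variance(data, ddof=0)  # ddof=0: divides by n
sample_std = math.sqrt(sample_var)

print(population_var)  # 4.0
print(sample_var)      # 32 / 7, roughly 4.571
print(sample_std)
```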

Reindexing / selection / label manipulation

DataFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, keep, ...])

Return a DataFrame with duplicate rows removed, optionally considering only a subset of columns.

DataFrame.equals(other)

Test whether two objects contain the same elements.

DataFrame.head([n])

Return the first n rows.

DataFrame.reindex([labels, axis, index, ...])

Return a new DataFrame whose axes conform to a new index

DataFrame.rename([mapper, index, columns, ...])

Alter column and index labels.

DataFrame.reset_index([level, drop, ...])

Reset the index.

DataFrame.sample([n, frac, replace, ...])

Return a random sample of items from an axis of object.

DataFrame.searchsorted(values[, side, ...])

Find indices where elements should be inserted to maintain order

DataFrame.set_index(keys[, drop, append, ...])

Return a new DataFrame with a new index

DataFrame.repeat(repeats[, axis])

Repeats elements consecutively.

DataFrame.tail([n])

Return the last n rows as a new DataFrame or Series.

DataFrame.take(positions[, keep_index])

Return a new DataFrame containing the rows specified by positions

DataFrame.tile(count)

Repeats the rows from self DataFrame count times to form a new DataFrame.
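
The repeat and tile entries above both duplicate rows but order the result differently. The sketch below models the two orderings on a plain Python list of row labels; the helper functions are hypothetical and only illustrate the row order described in the listing.

```python
def repeat(rows, repeats):
    """Repeat each element consecutively: a a b b c c."""
    return [row for row in rows for _ in range(repeats)]


def tile(rows, count):
    """Repeat the whole sequence of rows: a b c a b c."""
    return list(rows) * count


rows = ["a", "b", "c"]

repeated = repeat(rows, 2)
tiled = tile(rows, 2)

print(repeated)  # ['a', 'a', 'b', 'b', 'c', 'c']
print(tiled)     # ['a', 'b', 'c', 'a', 'b', 'c']
```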

Missing data handling

DataFrame.dropna([axis, how, thresh, ...])

Drop rows (or columns) containing nulls from a DataFrame.

DataFrame.fillna([value, method, axis, ...])

Fill null values with value or specified method.

DataFrame.isna()

Identify missing values.

DataFrame.isnull()

Identify missing values.

DataFrame.nans_to_nulls()

Convert NaNs (if any) to nulls.

DataFrame.notna()

Identify non-missing values.

DataFrame.notnull()

Identify non-missing values.

DataFrame.replace([to_replace, value, ...])

Replace values given in to_replace with replacement.
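
The listing above separates NaN values from nulls (nans_to_nulls) and offers fillna to replace nulls. The plain-Python model below shows how the two steps compose on one column. Representing a null as None is an assumption made for illustration, and the helpers are hypothetical rather than cuDF functions.

```python
import math


def nans_to_nulls(values):
    """Replace floating-point NaN entries with None (the model's null marker)."""
    return [None if isinstance(v, float) and math.isnan(v) else v for v in values]


def fillna(values, value):
    """Replace null (None) entries with `value`; leave everything else alone."""
    return [value if v is None else v for v in values]


column = [1.0, float("nan"), None, 4.0]

converted = nans_to_nulls(column)
filled = fillna(converted, 0.0)

print(converted)  # [1.0, None, None, 4.0]
print(filled)     # [1.0, 0.0, 0.0, 4.0]
```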

Reshaping, sorting, transposing

DataFrame.argsort([ascending, na_position])

Return the integer indices that would sort the DataFrame by its values.

DataFrame.interleave_columns()

Interleave Series columns of a table into a single column.

DataFrame.partition_by_hash(columns, nparts)

Partition the dataframe by the hashed value of data in columns.

DataFrame.pivot(index, columns[, values])

Return reshaped DataFrame organized by the given index and column values.

DataFrame.scatter_by_map(map_index[, ...])

Scatter to a list of dataframes.

DataFrame.sort_values(by[, axis, ascending, ...])

Sort by the values row-wise.

DataFrame.sort_index([axis, level, ...])

Sort object by labels (along an axis).

DataFrame.nlargest(n, columns[, keep])

Get the rows of the DataFrame sorted by the n largest values of columns.

DataFrame.nsmallest(n, columns[, keep])

Get the rows of the DataFrame sorted by the n smallest values of columns.

DataFrame.stack([level, dropna])

Stack the prescribed level(s) from columns to index

DataFrame.unstack([level, fill_value])

Pivot one or more levels of the (necessarily hierarchical) index labels.

DataFrame.melt(**kwargs)

Unpivots a DataFrame from wide format to long format, optionally leaving identifier variables set.

DataFrame.explode(column[, ignore_index])

Transform each element of a list-like to a row, replicating index values.

DataFrame.T

Transpose index and columns.

DataFrame.transpose()

Transpose index and columns.
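
The melt entry above describes unpivoting from wide to long format while leaving identifier variables in place. The sketch below models that reshaping on a list of dictionaries. The parameter names id_vars, value_vars, var_name, and value_name are borrowed from pandas.melt and are an assumption here, since the listing only shows melt(**kwargs).

```python
def melt(rows, id_vars, value_vars, var_name="variable", value_name="value"):
    """Turn each (row, value column) pair into one long-format row."""
    long_rows = []
    for var in value_vars:
        for row in rows:
            new_row = {key: row[key] for key in id_vars}  # identifiers are kept
            new_row[var_name] = var                       # former column name
            new_row[value_name] = row[var]                # former cell value
            long_rows.append(new_row)
    return long_rows


wide = [
    {"id": 1, "x": 10, "y": 100},
    {"id": 2, "x": 20, "y": 200},
]

long_table = melt(wide, id_vars=["id"], value_vars=["x", "y"])

for row in long_table:
    print(row)
# {'id': 1, 'variable': 'x', 'value': 10}
# {'id': 2, 'variable': 'x', 'value': 20}
# {'id': 1, 'variable': 'y', 'value': 100}
# {'id': 2, 'variable': 'y', 'value': 200}
```

Two wide rows with two value columns become four long rows, one per original cell.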

Combining / comparing / joining / merging / encoding

DataFrame.append(other[, ignore_index, ...])

Append rows of other to the end of caller, returning a new object.

DataFrame.assign(**kwargs)

Assign columns to DataFrame from keyword arguments.

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns with other DataFrame on index or on a key column.

DataFrame.merge(right[, on, left_on, ...])

Merge GPU DataFrame objects by performing a database-style join operation by columns or indexes.

DataFrame.update(other[, join, overwrite, ...])

Modify a DataFrame in place using non-NA values from another DataFrame.

DataFrame.label_encoding(column, prefix, cats)

Encode labels in a column with label encoding.

DataFrame.one_hot_encoding(column, prefix, cats)

Expand a column with one-hot-encoding.
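
The label_encoding and one_hot_encoding entries above both take a column, a prefix, and a list of categories (cats). The plain-Python sketch below illustrates the two encodings. The "prefix_category" column naming and the -1 code for values outside cats are assumptions for illustration, not behaviour confirmed by this listing.

```python
def label_encode(values, cats, na_sentinel=-1):
    """Map each value to its position in `cats`; unknown values get na_sentinel."""
    codes = {cat: i for i, cat in enumerate(cats)}
    return [codes.get(v, na_sentinel) for v in values]


def one_hot_encode(values, prefix, cats):
    """Build one 0/1 indicator column per category, named '<prefix>_<cat>'."""
    return {f"{prefix}_{cat}": [int(v == cat) for v in values] for cat in cats}


colors = ["red", "green", "red", "blue"]
cats = ["blue", "green", "red"]

labels = label_encode(colors, cats)
one_hot = one_hot_encode(colors, "color", cats)

print(labels)   # [2, 1, 2, 0]
print(one_hot)
# {'color_blue': [0, 0, 0, 1], 'color_green': [0, 1, 0, 0], 'color_red': [1, 0, 1, 0]}
```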

Numerical operations

DataFrame.acos()

Get Trigonometric inverse cosine, element-wise.

DataFrame.asin()

Get Trigonometric inverse sine, element-wise.

DataFrame.atan()

Get Trigonometric inverse tangent, element-wise.

DataFrame.cos()

Get Trigonometric cosine, element-wise.

DataFrame.exp()

Get the exponential of all elements, element-wise.

DataFrame.log()

Get the natural logarithm of all elements, element-wise.

DataFrame.sin()

Get Trigonometric sine, element-wise.

DataFrame.sqrt()

Get the non-negative square-root of all elements, element-wise.

DataFrame.tan()

Get Trigonometric tangent, element-wise.

Serialization / IO / conversion

DataFrame.as_gpu_matrix([columns, order])

Convert to a matrix in device memory.

DataFrame.as_matrix([columns])

Convert to a matrix in host memory.

DataFrame.from_arrow(table)

Convert from PyArrow Table to DataFrame.

DataFrame.from_pandas(dataframe[, nan_as_null])

Convert from a Pandas DataFrame.

DataFrame.from_records(data[, index, ...])

Convert structured or record ndarray to DataFrame.

DataFrame.hash_columns([columns])

Hash the given columns and return a new device array.

DataFrame.to_arrow([preserve_index])

Convert to a PyArrow Table.

DataFrame.to_dlpack()

Converts a cuDF object into a DLPack tensor.

DataFrame.to_parquet(path, *args, **kwargs)

Write a DataFrame to the parquet format.

DataFrame.to_csv([path_or_buf, sep, na_rep, ...])

Write a dataframe to csv file format.

DataFrame.to_hdf(path_or_buf, key, *args, ...)

Write the contained data to an HDF5 file using HDFStore.

DataFrame.to_dict([orient, into])

DataFrame.to_json([path_or_buf])

Convert the cuDF object to a JSON string.

DataFrame.to_pandas([nullable])

Convert to a Pandas DataFrame.

DataFrame.to_feather(path, *args, **kwargs)

Write a DataFrame to the feather format.

DataFrame.to_records([index])

Convert to a NumPy recarray.

DataFrame.to_string()

Convert to a string.