cudf.Series

class cudf.Series(data=None, index=None, dtype=None, name=None, copy=False, nan_as_null=True)

One-dimensional GPU array (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as null/NaN).
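The missing-data behavior described above can be sketched with a few reductions: the missing entry is skipped rather than propagated. The `xdf` alias and the pandas fallback are assumptions added so the snippet also runs on a machine without a GPU; they are not part of cuDF.

```python
try:
    import cudf as xdf  # GPU-backed Series
except ImportError:
    import pandas as xdf  # assumption: CPU stand-in exposing the same calls

# None becomes a missing value (null in cuDF, NaN in pandas).
s = xdf.Series([1.0, None, 3.0])

observed = int(s.count())   # number of non-missing entries
total = float(s.sum())      # the missing entry is skipped
average = float(s.mean())   # mean over the two observed values

print(observed, total, average)
```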

Operations between Series (+, -, /, *, **) align values based on their associated index values; the operands need not be the same length. The result index will be the sorted union of the two indexes.
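A small sketch of label alignment, assuming two Series of different lengths with string labels. Labels present in only one operand still appear in the result, but with a missing value. The `xdf` alias and the pandas fallback are assumptions for running without a GPU.

```python
try:
    import cudf as xdf  # GPU-backed Series
except ImportError:
    import pandas as xdf  # assumption: CPU stand-in exposing the same calls

left = xdf.Series([1, 2, 3], index=["a", "b", "c"])
right = xdf.Series([10, 20], index=["b", "c"])

# Values are paired by label, not by position.
total = left + right

# The result index is the union of both indexes; "a" has no partner,
# so its entry is missing and dropna() removes it.
labels = list(total.to_dict().keys())
matched = total.dropna().to_dict()

print(labels)
print(matched)
```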

Series objects are used as columns of DataFrame.

Parameters:
data : array-like, Iterable, dict, or scalar value

Contains data stored in Series.

index : array-like or Index (1d)

Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n - 1) if not provided. If both a dict and index sequence are used, the index will override the keys found in the dict.

dtype : str, numpy.dtype, or ExtensionDtype, optional

Data type for the output Series. If not specified, this will be inferred from data.

name : str, optional

The name to give to the Series.

copy : bool, default False

Copy input data. Only affects Series or 1d ndarray input.

nan_as_null : bool, default True

If None/True, converts np.nan values to null values. If False, leaves np.nan values as is.
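As a hedged illustration of the constructor parameters above, the sketch below builds one Series with an explicit dtype and name and another with explicit labels. The variable names, the `xdf` alias, and the pandas fallback are assumptions for illustration; `nan_as_null` is cuDF-specific and is not exercised here.

```python
try:
    import cudf as xdf  # GPU-backed Series
except ImportError:
    import pandas as xdf  # assumption: CPU stand-in exposing the same calls

# dtype overrides inference (the integers are stored as float64),
# and the index defaults to a RangeIndex starting at 0.
readings = xdf.Series([1, 2, 3], dtype="float64", name="reading")
dtype_name = str(readings.dtype)
default_labels = list(readings.to_dict().keys())

# An explicit index supplies the labels used by label-based lookup.
labelled = xdf.Series([1, 2, 3], index=["x", "y", "z"])
picked = int(labelled.loc["y"])

print(dtype_name, readings.name, default_labels, picked)
```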

Attributes

T

Return the transpose, which is by definition self.

axes

Return a list representing the axes of the Series.

cat

Accessor object for categorical properties of the Series values.

data

The gpu buffer for the data

dt

Accessor object for datetime-like properties of the Series values.

dtype

The dtype of the Series.

empty

Indicator whether DataFrame or Series is empty.

has_nulls

Indicator whether Series contains null values.

hasnans

Return True if there are any NaNs or nulls.

index

Get the labels for the rows.

is_monotonic_decreasing

Return boolean if values in the object are monotonically decreasing.

is_monotonic_increasing

Return boolean if values in the object are monotonically increasing.

is_unique

Return boolean if values in the object are unique.

list

List methods for Series

name

Get the name of this object.

ndim

Number of dimensions of the underlying data, by definition 1.

null_count

Number of null values

nullable

A boolean indicating whether a null-mask is needed

nullmask

The gpu buffer for the null-mask

shape

Get a tuple representing the dimensionality of the Series.

size

Return the number of elements in the underlying data.

str

Vectorized string functions for Series and Index.

struct

Struct methods for Series

valid_count

Number of non-null values

values

Return a CuPy representation of the Series.

values_host

Return a NumPy representation of the data.

iloc

Select values by position.

Examples
--------

Series

>>> import cudf
>>> s = cudf.Series([10, 20, 30])
>>> s
0    10
1    20
2    30
dtype: int64
>>> s.iloc[2]
30

DataFrame

Selecting rows and column by position.

>>> df = cudf.DataFrame({'a': range(20),
...                      'b': range(20),
...                      'c': range(20)})

Select a single row using an integer index.

>>> df.iloc[1]
a    1
b    1
c    1
Name: 1, dtype: int64

Select multiple rows using a list of integers.

>>> df.iloc[[0, 2, 9, 18]]
     a   b   c
0    0   0   0
2    2   2   2
9    9   9   9
18  18  18  18

Select rows using a slice.

>>> df.iloc[3:10:2]
   a  b  c
3  3  3  3
5  5  5  5
7  7  7  7
9  9  9  9

Select both rows and columns.

>>> df.iloc[[1, 3, 5, 7], 2]
1    1
3    3
5    5
7    7
Name: c, dtype: int64

Setting values in a column using iloc.

>>> df.iloc[:4] = 0
>>> df
   a  b  c
0  0  0  0
1  0  0  0
2  0  0  0
3  0  0  0
4  4  4  4
5  5  5  5
6  6  6  6
7  7  7  7
8  8  8  8
9  9  9  9
[10 more rows]

loc

Select rows and columns by label or boolean mask.

Examples
--------

Series

>>> import cudf
>>> series = cudf.Series([10, 11, 12], index=['a', 'b', 'c'])
>>> series
a    10
b    11
c    12
dtype: int64
>>> series.loc['b']
11

DataFrame

DataFrame with string index.

>>> df
   a  b
a  0  5
b  1  6
c  2  7
d  3  8
e  4  9

Select a single row by label.

>>> df.loc['a']
a    0
b    5
Name: a, dtype: int64

Select multiple rows and a single column.

>>> df.loc[['a', 'c', 'e'], 'b']
a    5
c    7
e    9
Name: b, dtype: int64

Selection by boolean mask.

>>> df.loc[df.a > 2]
   a  b
d  3  8
e  4  9

Setting values using loc.

>>> df.loc[['a', 'c', 'e'], 'a'] = 0
>>> df
   a  b
a  0  5
b  1  6
c  0  7
d  3  8
e  0  9
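Several of the scalar attributes listed above can be read off a small Series as follows. This is a sketch only: the `summary` dict is a hypothetical container, the values are wrapped in built-in types for printing, and the `xdf` alias with its pandas fallback is an assumption for running without a GPU. cuDF-only attributes such as null_count and valid_count are left out for that reason.

```python
try:
    import cudf as xdf  # GPU-backed Series
except ImportError:
    import pandas as xdf  # assumption: CPU stand-in exposing the same calls

s = xdf.Series([10, 20, 30, 40], name="score")

summary = {
    "name": s.name,
    "shape": tuple(s.shape),  # one-element tuple, since ndim is 1
    "size": int(s.size),
    "ndim": int(s.ndim),
    "empty": bool(s.empty),
    "is_unique": bool(s.is_unique),
    "is_monotonic_increasing": bool(s.is_monotonic_increasing),
    "is_monotonic_decreasing": bool(s.is_monotonic_decreasing),
    "hasnans": bool(s.hasnans),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```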

Methods

abs()

Return a Series/DataFrame with absolute numeric value of each element.

add(other[, level, fill_value, axis])

Get Addition of DataFrame or Series and other, element-wise (binary operator add).

add_prefix(prefix)

Prefix labels with string prefix.

add_suffix(suffix)

Suffix labels with string suffix.

all([axis, bool_only, skipna])

Return whether all elements are True in DataFrame.

any([axis, bool_only, skipna])

Return whether any element is True in DataFrame.

apply(func[, convert_dtype, args])

Apply a scalar function to the values of a Series.

argsort([axis, kind, order, ascending, ...])

Return the integer indices that would sort the Series values.

astype(dtype[, copy, errors])

Cast the object to the given dtype.

autocorr([lag])

Compute the lag-N autocorrelation.

backfill([value, axis, inplace, limit])

Synonym for Series.fillna() with method='bfill'.

between(left, right[, inclusive])

Return boolean Series equivalent to left <= series <= right.

bfill([value, axis, inplace, limit])

Synonym for Series.fillna() with method='bfill'.

clip([lower, upper, inplace, axis])

Trim values at input threshold(s).

convert_dtypes([infer_objects, ...])

Convert columns to the best possible nullable dtypes.

copy([deep])

Make a copy of this object's indices and data.

corr(other[, method, min_periods])

Calculates the sample correlation between two Series, excluding missing values.

count()

Return number of non-NA/null observations in the Series

cov(other[, min_periods])

Compute covariance with Series, excluding missing values.

cummax([axis, skipna])

Return cumulative max of the Series.

cummin([axis, skipna])

Return cumulative min of the Series.

cumprod([axis, skipna])

Return cumulative product of the Series.

cumsum([axis, skipna])

Return cumulative sum of the Series.

describe([percentiles, include, exclude])

Generate descriptive statistics.

deserialize(header, frames)

Generate an object from a serialized representation.

device_deserialize(header, frames)

Perform device-side deserialization tasks.

device_serialize()

Serialize data and metadata associated with device memory.

diff([periods])

First discrete difference of element.

digitize(bins[, right])

Return the indices of the bins to which each value belongs.

div(other[, level, fill_value, axis])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

divide(other[, level, fill_value, axis])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

dot(other[, reflect])

Get dot product of frame and other, (binary operator dot).

drop([labels, axis, index, columns, level, ...])

Drop specified labels from rows or columns.

drop_duplicates([keep, inplace, ignore_index])

Return Series with duplicate values removed.

dropna([axis, inplace, how])

Return a Series with null values removed.

duplicated([keep])

Indicate duplicate Series values.

eq(other[, level, fill_value, axis])

Get Equal to of DataFrame or Series and other, element-wise (binary operator eq).

equals(other)

Test whether two objects contain the same elements.

explode([ignore_index])

Transform each element of a list-like to a row, replicating index values.

factorize([sort, use_na_sentinel])

Encode the input values as integer labels.

ffill([value, axis, inplace, limit])

Synonym for Series.fillna() with method='ffill'.

fillna([value, method, axis, inplace, limit])

Fill null values with value or specified method.

first(offset)

Select initial periods of time series data based on a date offset.

floordiv(other[, level, fill_value, axis])

Get Integer division of DataFrame or Series and other, element-wise (binary operator floordiv).

from_arrow(array)

Create from PyArrow Array/ChunkedArray.

from_categorical(categorical[, codes])

Creates from a pandas.Categorical

from_masked_array(data, mask[, null_count])

Create a Series with null-mask.

from_pandas(s[, nan_as_null])

Convert from a Pandas Series.

ge(other[, level, fill_value, axis])

Get Greater than or equal to of DataFrame or Series and other, element-wise (binary operator ge).

groupby([by, axis, level, as_index, sort, ...])

Group using a mapper or by a Series of columns.

gt(other[, level, fill_value, axis])

Get Greater than of DataFrame or Series and other, element-wise (binary operator gt).

hash_values([method, seed])

Compute the hash of values in this column.

head([n])

Return the first n rows.

host_deserialize(header, frames)

Perform device-side deserialization tasks.

host_serialize()

Serialize data and metadata associated with host memory.

interpolate([method, axis, limit, inplace, ...])

Interpolate data values between some points.

isin(values)

Check whether values are contained in Series.

isna()

Identify missing values.

isnull()

Identify missing values.

items()

Iteration is unsupported.

iteritems()

Iteration is unsupported.

keys()

Return alias for index.

kurt([axis, skipna, numeric_only])

Return Fisher's unbiased kurtosis of a sample.

kurtosis([axis, skipna, numeric_only])

Return Fisher's unbiased kurtosis of a sample.

last(offset)

Select final periods of time series data based on a date offset.

le(other[, level, fill_value, axis])

Get Less than or equal to of DataFrame or Series and other, element-wise (binary operator le).

lt(other[, level, fill_value, axis])

Get Less than of DataFrame or Series and other, element-wise (binary operator lt).

map(arg[, na_action])

Map values of Series according to input correspondence.

mask(cond[, other, inplace])

Replace values where the condition is True.

max([axis, skipna, numeric_only])

Return the maximum of the values in the DataFrame.

mean([axis, skipna, numeric_only])

Return the mean of the values for the requested axis.

median([axis, skipna, level, numeric_only])

Return the median of the values for the requested axis.

memory_usage([index, deep])

Return the memory usage of an object.

min([axis, skipna, numeric_only])

Return the minimum of the values in the DataFrame.

mod(other[, level, fill_value, axis])

Get Modulo of DataFrame or Series and other, element-wise (binary operator mod).

mode([dropna])

Return the mode(s) of the dataset.

mul(other[, level, fill_value, axis])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).

multiply(other[, level, fill_value, axis])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).

nans_to_nulls()

Convert nans (if any) to nulls

ne(other[, level, fill_value, axis])

Get Not equal to of DataFrame or Series and other, element-wise (binary operator ne).

nlargest([n, keep])

Returns a new Series of the n largest elements.

notna()

Identify non-missing values.

notnull()

Identify non-missing values.

nsmallest([n, keep])

Returns a new Series of the n smallest elements.

nunique([dropna])

Return count of unique values for the column.

pad([value, axis, inplace, limit])

Synonym for Series.fillna() with method='ffill'.

pct_change([periods, fill_method, limit, freq])

Calculates the percent change between sequential elements in the Series.

pipe(func, *args, **kwargs)

Apply func(self, *args, **kwargs).

pow(other[, level, fill_value, axis])

Get Exponential of DataFrame or Series and other, element-wise (binary operator pow).

prod([axis, skipna, dtype, numeric_only, ...])

Return product of the values in the DataFrame.

product([axis, skipna, dtype, numeric_only, ...])

Return product of the values in the DataFrame.

quantile([q, interpolation, exact, quant_index])

Return values at the given quantile.

radd(other[, level, fill_value, axis])

Get Addition of DataFrame or Series and other, element-wise (binary operator radd).

rank([axis, method, numeric_only, ...])

Compute numerical data ranks (1 through n) along axis.

rdiv(other[, level, fill_value, axis])

Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).

reindex(*args, **kwargs)

Conform Series to new index.

rename([index, copy])

Alter Series name

repeat(repeats[, axis])

Repeats elements consecutively.

replace([to_replace, value])

Replace values given in to_replace with value.

resample(rule[, axis, closed, label, ...])

Convert the frequency of ("resample") the given time series data.

reset_index([level, drop, name, inplace])

Reset the index of the Series, or a level of it.

rfloordiv(other[, level, fill_value, axis])

Get Integer division of DataFrame or Series and other, element-wise (binary operator rfloordiv).

rmod(other[, level, fill_value, axis])

Get Modulo of DataFrame or Series and other, element-wise (binary operator rmod).

rmul(other[, level, fill_value, axis])

Get Multiplication of DataFrame or Series and other, element-wise (binary operator rmul).

rolling(window[, min_periods, center, axis, ...])

Rolling window calculations.

round([decimals, how])

Round to a variable number of decimal places.

rpow(other[, level, fill_value, axis])

Get Exponential of DataFrame or Series and other, element-wise (binary operator rpow).

rsub(other[, level, fill_value, axis])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator rsub).

rtruediv(other[, level, fill_value, axis])

Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).

sample([n, frac, replace, weights, ...])

Return a random sample of items from an axis of object.

scale()

Scale values to [0, 1] in float64

searchsorted(values[, side, ascending, ...])

Find indices where elements should be inserted to maintain order

serialize()

Generate an equivalent serializable representation of an object.

shift([periods, freq, axis, fill_value])

Shift values by periods positions.

skew([axis, skipna, numeric_only])

Return unbiased Fisher-Pearson skew of a sample.

sort_index([axis])

Sort object by labels (along an axis).

sort_values([axis, ascending, inplace, ...])

Sort by the values along either axis.

squeeze([axis])

Squeeze 1 dimensional axis objects into scalars.

std([axis, skipna, ddof, numeric_only])

Return sample standard deviation of the DataFrame.

sub(other[, level, fill_value, axis])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).

subtract(other[, level, fill_value, axis])

Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).

sum([axis, skipna, dtype, numeric_only, ...])

Return sum of the values in the DataFrame.

tail([n])

Returns the last n rows as a new DataFrame or Series

take(indices[, axis])

Return a new frame containing the rows specified by indices.

tile(count)

Repeats the rows count times to form a new Frame.

to_arrow()

Convert to a PyArrow Array.

to_cupy([dtype, copy, na_value])

Convert the Frame to a CuPy array.

to_dict([into])

Convert Series to {label -> value} dict or dict-like object.

to_dlpack()

Converts a cuDF object into a DLPack tensor.

to_frame([name])

Convert Series into a DataFrame

to_hdf(path_or_buf, key, *args, **kwargs)

Write the contained data to an HDF5 file using HDFStore.

to_json([path_or_buf])

Convert the cuDF object to a JSON string.

to_numpy([dtype, copy, na_value])

Convert the Frame to a NumPy array.

to_pandas(*[, index, nullable, arrow_type])

Convert to a pandas Series.

to_string()

Convert to string

transpose()

Return the transpose, which is by definition self.

truediv(other[, level, fill_value, axis])

Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).

truncate([before, after, axis, copy])

Truncate a Series or DataFrame before and after some index value.

unique()

Returns unique values of this Series.

update(other)

Modify Series in place using values from passed Series.

value_counts([normalize, sort, ascending, ...])

Return a Series containing counts of unique values.

var([axis, skipna, ddof, numeric_only])

Return unbiased variance of the DataFrame.

where(cond[, other, inplace])

Replace values where the condition is False.

to_list

tolist
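A short sketch contrasting a few of the methods summarized above: between() builds a boolean mask, where() and mask() are complements of each other, and fillna() replaces missing entries. The variable names, the `xdf` alias, and the pandas fallback are assumptions for illustration.

```python
try:
    import cudf as xdf  # GPU-backed Series
except ImportError:
    import pandas as xdf  # assumption: CPU stand-in exposing the same calls

s = xdf.Series([1, 5, 10, 15])

# between(): True where 5 <= value <= 10.
in_range = s.between(5, 10)

# where() replaces entries whose condition is False;
# mask() replaces entries whose condition is True.
kept = s.where(s > 5, 0)
replaced = s.mask(s > 5, 0)

# fillna() substitutes a value for missing entries.
gaps = xdf.Series([1.0, None, 3.0])
filled = gaps.fillna(0.0)

for label, result in [("between", in_range), ("where", kept),
                      ("mask", replaced), ("fillna", filled)]:
    print(label, list(result.to_dict().values()))
```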