cudf.Series

class cudf.Series(data=None, index=None, dtype=None, name=None, copy=False, nan_as_null=True)

One-dimensional GPU array (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as null/NaN).
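As a rough illustration of how the statistical methods skip missing data, the sketch below builds a small Series with one missing entry and computes a few reductions. The alias `xdf` and the pandas fallback are assumptions added here so the snippet also runs on a machine without cuDF or a GPU; they are not part of the cuDF API.

```python
# Assumption: prefer cuDF, but fall back to pandas (whose API cuDF mirrors)
# when cuDF or a usable GPU is unavailable, so the sketch runs anywhere.
try:
    import cudf as xdf
    xdf.Series([0])  # forces GPU initialisation; raises without a usable device
except Exception:
    import pandas as xdf

# One of the three entries is missing.
s = xdf.Series([1.0, None, 3.0])

n_valid = s.count()  # counts only the non-missing observations
mean = s.mean()      # the missing entry is excluded from the average
total = s.sum()      # and from the sum

print(n_valid, mean, total)
```

The mean is computed over the two present values only, so the missing entry neither contributes to the total nor to the divisor.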

Operations between Series (+, -, /, *, **) align values based on their associated index values; the operands need not be the same length. The result index will be the sorted union of the two indexes.
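The alignment rule can be sketched with two Series of different lengths whose labels only partly overlap. The names `xdf` and `to_host` are hypothetical helpers introduced for this example, and the pandas fallback is an assumption so the snippet runs without a GPU.

```python
# Assumption: prefer cuDF, but fall back to pandas (whose API cuDF mirrors)
# when cuDF or a usable GPU is unavailable, so the sketch runs anywhere.
try:
    import cudf as xdf
    xdf.Series([0])  # forces GPU initialisation; raises without a usable device
except Exception:
    import pandas as xdf


def to_host(obj):
    """Hypothetical helper: return a pandas object for either backend."""
    return obj.to_pandas() if hasattr(obj, "to_pandas") else obj


a = xdf.Series([1, 2, 3], index=["x", "y", "z"])
b = xdf.Series([10, 20], index=["z", "y"])  # shorter, different label order

# Values are matched by label, not by position.
total = to_host(a + b)

print(total)
```

Label "x" appears only in `a`, so its result is missing, while "y" and "z" are summed with their matching partners (2 + 20 and 3 + 10).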

Series objects are used as columns of DataFrame.

Parameters:

data (array-like, Iterable, dict, or scalar value): Contains data stored in Series.
index (array-like or Index (1d)): Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n - 1) if not provided. If both a dict and index sequence are used, the index will override the keys found in the dict.
dtype (str, numpy.dtype, or ExtensionDtype, optional): Data type for the output Series. If not specified, this will be inferred from data.
name (str, optional): The name to give to the Series.
copy (bool, default False): Copy input data. Only affects Series or 1d ndarray input.
nan_as_null (bool, default True): If None/True, converts np.nan values to null values. If False, leaves np.nan values as is.
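A minimal sketch of the constructor parameters above, covering `dtype`, `name`, an explicit `index`, and dict input. The alias `xdf` and the pandas fallback are assumptions added so the snippet runs without a GPU; `nan_as_null` is cuDF-specific and is left out for that reason.

```python
# Assumption: prefer cuDF, but fall back to pandas (whose API cuDF mirrors)
# when cuDF or a usable GPU is unavailable, so the sketch runs anywhere.
try:
    import cudf as xdf
    xdf.Series([0])  # forces GPU initialisation; raises without a usable device
except Exception:
    import pandas as xdf

# Explicit dtype and name; the index defaults to a RangeIndex.
s = xdf.Series([1, 2, 3], dtype="float32", name="reading")

# Explicit labels, one per value.
labelled = xdf.Series([1, 2, 3], index=["a", "b", "c"])

# Dict input: the keys become the index labels.
from_dict = xdf.Series({"a": 1, "b": 2})

print(s.name, s.dtype, type(s.index).__name__)
print(labelled.loc["b"], from_dict.loc["b"])
```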

Attributes

`T`	Return the transpose, which is by definition self.
`axes`	Return a list representing the axes of the Series.
`cat`	Accessor object for categorical properties of the Series values.
`data`	The GPU buffer for the data
`dt`	Accessor object for datetime-like properties of the Series values.
`dtype`	The dtype of the Series.
`empty`	Indicator whether DataFrame or Series is empty.
`has_nulls`	Indicator whether Series contains null values.
`hasnans`	Return True if there are any NaNs or nulls.
`index`	Get the labels for the rows.
`is_monotonic_decreasing`	Return boolean if values in the object are monotonically decreasing.
`is_monotonic_increasing`	Return boolean if values in the object are monotonically increasing.
`is_unique`	Return boolean if values in the object are unique.
`list`	List methods for Series
`name`	Get the name of this object.
`ndim`	Number of dimensions of the underlying data, by definition 1.
`null_count`	Number of null values
`nullable`	A boolean indicating whether a null-mask is needed
`nullmask`	The GPU buffer for the null-mask
`shape`	Get a tuple representing the dimensionality of the Index.
`size`	Return the number of elements in the underlying data.
`str`	Vectorized string functions for Series and Index.
`struct`	Struct methods for Series
`valid_count`	Number of non-null values
`values`	Return a CuPy representation of the DataFrame.
`values_host`	Return a NumPy representation of the data.

iloc

Select values by position.

Examples

Series

>>> import cudf
>>> s = cudf.Series([10, 20, 30])
>>> s
0    10
1    20
2    30
dtype: int64
>>> s.iloc[2]
30

DataFrame

Selecting rows and column by position.

>>> df = cudf.DataFrame({'a': range(20),
...                      'b': range(20),
...                      'c': range(20)})

Select a single row using an integer index.

>>> df.iloc[1]
a    1
b    1
c    1
Name: 1, dtype: int64

Select multiple rows using a list of integers.

>>> df.iloc[[0, 2, 9, 18]]
     a   b   c
0    0   0   0
2    2   2   2
9    9   9   9
18  18  18  18

Select rows using a slice.

>>> df.iloc[3:10:2]
   a  b  c
3  3  3  3
5  5  5  5
7  7  7  7
9  9  9  9

Select both rows and columns.

>>> df.iloc[[1, 3, 5, 7], 2]
1    1
3    3
5    5
7    7
Name: c, dtype: int64

Setting values in a column using iloc.

>>> df.iloc[:4] = 0
>>> df
   a  b  c
0  0  0  0
1  0  0  0
2  0  0  0
3  0  0  0
4  4  4  4
5  5  5  5
6  6  6  6
7  7  7  7
8  8  8  8
9  9  9  9
[10 more rows]

loc

Select rows and columns by label or boolean mask.

Examples

Series

>>> import cudf
>>> series = cudf.Series([10, 11, 12], index=['a', 'b', 'c'])
>>> series
a    10
b    11
c    12
dtype: int64
>>> series.loc['b']
11

DataFrame

DataFrame with string index.

>>> df
   a  b
a  0  5
b  1  6
c  2  7
d  3  8
e  4  9

Select a single row by label.

>>> df.loc['a']
a    0
b    5
Name: a, dtype: int64

Select multiple rows and a single column.

>>> df.loc[['a', 'c', 'e'], 'b']
a    5
c    7
e    9
Name: b, dtype: int64

Selection by boolean mask.

>>> df.loc[df.a > 2]
   a  b
d  3  8
e  4  9

Setting values using loc.

>>> df.loc[['a', 'c', 'e'], 'a'] = 0
>>> df
   a  b
a  0  5
b  1  6
c  0  7
d  3  8
e  0  9
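To contrast the two indexers on a Series, the sketch below reads the same data by label with `loc` and by position with `iloc`, and also shows a list of labels and a boolean mask. The names `xdf` and `to_host` are hypothetical helpers for this example, and the pandas fallback is an assumption so the snippet runs without a GPU.

```python
# Assumption: prefer cuDF, but fall back to pandas (whose API cuDF mirrors)
# when cuDF or a usable GPU is unavailable, so the sketch runs anywhere.
try:
    import cudf as xdf
    xdf.Series([0])  # forces GPU initialisation; raises without a usable device
except Exception:
    import pandas as xdf


def to_host(obj):
    """Hypothetical helper: return a pandas object for either backend."""
    return obj.to_pandas() if hasattr(obj, "to_pandas") else obj


s = xdf.Series([10, 11, 12, 13], index=["a", "b", "c", "d"])

by_label = s.loc["b"]                 # label-based lookup
by_position = s.iloc[2]               # zero-based positional lookup
picked = to_host(s.loc[["a", "c"]])   # several labels at once
masked = to_host(s.loc[s > 11])       # boolean mask keeps rows where True

print(by_label, by_position)
print(picked.tolist(), masked.tolist())
```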

Methods

`abs`()	Return a Series/DataFrame with absolute numeric value of each element.
`add`(other[, level, fill_value, axis])	Get Addition of DataFrame or Series and other, element-wise (binary operator add).
`add_prefix`(prefix)	Prefix labels with string prefix.
`add_suffix`(suffix)	Suffix labels with string suffix.
`all`([axis, bool_only, skipna])	Return whether all elements are True in DataFrame.
`any`([axis, bool_only, skipna])	Return whether any element is True in DataFrame.
`apply`(func[, convert_dtype, args])	Apply a scalar function to the values of a Series.
`argsort`([axis, kind, order, ascending, ...])	Return the integer indices that would sort the Series values.
`astype`(dtype[, copy, errors])	Cast the object to the given dtype.
`autocorr`([lag])	Compute the lag-N autocorrelation.
`backfill`([value, axis, inplace, limit])	Synonym for `Series.fillna()` with `method='bfill'`.
`between`(left, right[, inclusive])	Return boolean Series equivalent to left <= series <= right.
`bfill`([value, axis, inplace, limit])	Synonym for `Series.fillna()` with `method='bfill'`.
`clip`([lower, upper, inplace, axis])	Trim values at input threshold(s).
`convert_dtypes`([infer_objects, ...])	Convert columns to the best possible nullable dtypes.
`copy`([deep])	Make a copy of this object's indices and data.
`corr`(other[, method, min_periods])	Calculates the sample correlation between two Series, excluding missing values.
`count`()	Return number of non-NA/null observations in the Series
`cov`(other[, min_periods])	Compute covariance with Series, excluding missing values.
`cummax`([axis, skipna])	Return cumulative max of the Series.
`cummin`([axis, skipna])	Return cumulative min of the Series.
`cumprod`([axis, skipna])	Return cumulative product of the Series.
`cumsum`([axis, skipna])	Return cumulative sum of the Series.
`describe`([percentiles, include, exclude])	Generate descriptive statistics.
`deserialize`(header, frames)	Generate an object from a serialized representation.
`device_deserialize`(header, frames)	Perform device-side deserialization tasks.
`device_serialize`()	Serialize data and metadata associated with device memory.
`diff`([periods])	First discrete difference of element.
`digitize`(bins[, right])	Return the indices of the bins to which each value belongs.
`div`(other[, level, fill_value, axis])	Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).
`divide`(other[, level, fill_value, axis])	Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).
`dot`(other[, reflect])	Get dot product of frame and other, (binary operator dot).
`drop`([labels, axis, index, columns, level, ...])	Drop specified labels from rows or columns.
`drop_duplicates`([keep, inplace, ignore_index])	Return Series with duplicate values removed.
`dropna`([axis, inplace, how])	Return a Series with null values removed.
`duplicated`([keep])	Indicate duplicate Series values.
`eq`(other[, level, fill_value, axis])	Get Equal to of DataFrame or Series and other, element-wise (binary operator eq).
`equals`(other)	Test whether two objects contain the same elements.
`explode`([ignore_index])	Transform each element of a list-like to a row, replicating index values.
`factorize`([sort, use_na_sentinel])	Encode the input values as integer labels.
`ffill`([value, axis, inplace, limit])	Synonym for `Series.fillna()` with `method='ffill'`.
`fillna`([value, method, axis, inplace, limit])	Fill null values with `value` or specified `method`.
`first`(offset)	Select initial periods of time series data based on a date offset.
`floordiv`(other[, level, fill_value, axis])	Get Integer division of DataFrame or Series and other, element-wise (binary operator floordiv).
`from_arrow`(array)	Create from PyArrow Array/ChunkedArray.
`from_categorical`(categorical[, codes])	Creates from a pandas.Categorical
`from_masked_array`(data, mask[, null_count])	Create a Series with null-mask.
`from_pandas`(s[, nan_as_null])	Convert from a Pandas Series.
`ge`(other[, level, fill_value, axis])	Get Greater than or equal to of DataFrame or Series and other, element-wise (binary operator ge).
`groupby`([by, axis, level, as_index, sort, ...])	Group using a mapper or by a Series of columns.
`gt`(other[, level, fill_value, axis])	Get Greater than of DataFrame or Series and other, element-wise (binary operator gt).
`hash_values`([method, seed])	Compute the hash of values in this column.
`head`([n])	Return the first n rows.
`host_deserialize`(header, frames)	Perform device-side deserialization tasks.
`host_serialize`()	Serialize data and metadata associated with host memory.
`interpolate`([method, axis, limit, inplace, ...])	Interpolate data values between some points.
`isin`(values)	Check whether values are contained in Series.
`isna`()	Identify missing values.
`isnull`()	Identify missing values.
`items`()	Iteration is unsupported.
`iteritems`()	Iteration is unsupported.
`keys`()	Return alias for index.
`kurt`([axis, skipna, numeric_only])	Return Fisher's unbiased kurtosis of a sample.
`kurtosis`([axis, skipna, numeric_only])	Return Fisher's unbiased kurtosis of a sample.
`last`(offset)	Select final periods of time series data based on a date offset.
`le`(other[, level, fill_value, axis])	Get Less than or equal to of DataFrame or Series and other, element-wise (binary operator le).
`lt`(other[, level, fill_value, axis])	Get Less than of DataFrame or Series and other, element-wise (binary operator lt).
`map`(arg[, na_action])	Map values of Series according to input correspondence.
`mask`(cond[, other, inplace])	Replace values where the condition is True.
`max`([axis, skipna, numeric_only])	Return the maximum of the values in the DataFrame.
`mean`([axis, skipna, numeric_only])	Return the mean of the values for the requested axis.
`median`([axis, skipna, level, numeric_only])	Return the median of the values for the requested axis.
`memory_usage`([index, deep])	Return the memory usage of an object.
`min`([axis, skipna, numeric_only])	Return the minimum of the values in the DataFrame.
`mod`(other[, level, fill_value, axis])	Get Modulo of DataFrame or Series and other, element-wise (binary operator mod).
`mode`([dropna])	Return the mode(s) of the dataset.
`mul`(other[, level, fill_value, axis])	Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).
`multiply`(other[, level, fill_value, axis])	Get Multiplication of DataFrame or Series and other, element-wise (binary operator mul).
`nans_to_nulls`()	Convert nans (if any) to nulls
`ne`(other[, level, fill_value, axis])	Get Not equal to of DataFrame or Series and other, element-wise (binary operator ne).
`nlargest`([n, keep])	Returns a new Series of the n largest elements.
`notna`()	Identify non-missing values.
`notnull`()	Identify non-missing values.
`nsmallest`([n, keep])	Returns a new Series of the n smallest elements.
`nunique`([dropna])	Return count of unique values for the column.
`pad`([value, axis, inplace, limit])	Synonym for `Series.fillna()` with `method='ffill'`.
`pct_change`([periods, fill_method, limit, freq])	Calculates the percent change between sequential elements in the Series.
`pipe`(func, *args, **kwargs)	Apply `func(self, *args, **kwargs)`.
`pow`(other[, level, fill_value, axis])	Get Exponential of DataFrame or Series and other, element-wise (binary operator pow).
`prod`([axis, skipna, dtype, numeric_only, ...])	Return product of the values in the DataFrame.
`product`([axis, skipna, dtype, numeric_only, ...])	Return product of the values in the DataFrame.
`quantile`([q, interpolation, exact, quant_index])	Return values at the given quantile.
`radd`(other[, level, fill_value, axis])	Get Addition of DataFrame or Series and other, element-wise (binary operator radd).
`rank`([axis, method, numeric_only, ...])	Compute numerical data ranks (1 through n) along axis.
`rdiv`(other[, level, fill_value, axis])	Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).
`reindex`(*args, **kwargs)	Conform Series to new index.
`rename`([index, copy])	Alter Series name
`repeat`(repeats[, axis])	Repeats elements consecutively.
`replace`([to_replace, value])	Replace values given in `to_replace` with `value`.
`resample`(rule[, axis, closed, label, ...])	Convert the frequency of ("resample") the given time series data.
`reset_index`([level, drop, name, inplace])	Reset the index of the Series, or a level of it.
`rfloordiv`(other[, level, fill_value, axis])	Get Integer division of DataFrame or Series and other, element-wise (binary operator rfloordiv).
`rmod`(other[, level, fill_value, axis])	Get Modulo of DataFrame or Series and other, element-wise (binary operator rmod).
`rmul`(other[, level, fill_value, axis])	Get Multiplication of DataFrame or Series and other, element-wise (binary operator rmul).
`rolling`(window[, min_periods, center, axis, ...])	Rolling window calculations.
`round`([decimals, how])	Round to a variable number of decimal places.
`rpow`(other[, level, fill_value, axis])	Get Exponential of DataFrame or Series and other, element-wise (binary operator rpow).
`rsub`(other[, level, fill_value, axis])	Get Subtraction of DataFrame or Series and other, element-wise (binary operator rsub).
`rtruediv`(other[, level, fill_value, axis])	Get Floating division of DataFrame or Series and other, element-wise (binary operator rtruediv).
`sample`([n, frac, replace, weights, ...])	Return a random sample of items from an axis of object.
`scale`()	Scale values to [0, 1] in float64
`searchsorted`(values[, side, ascending, ...])	Find indices where elements should be inserted to maintain order
`serialize`()	Generate an equivalent serializable representation of an object.
`shift`([periods, freq, axis, fill_value])	Shift values by periods positions.
`skew`([axis, skipna, numeric_only])	Return unbiased Fisher-Pearson skew of a sample.
`sort_index`([axis])	Sort object by labels (along an axis).
`sort_values`([axis, ascending, inplace, ...])	Sort by the values along either axis.
`squeeze`([axis])	Squeeze 1 dimensional axis objects into scalars.
`std`([axis, skipna, ddof, numeric_only])	Return sample standard deviation of the DataFrame.
`sub`(other[, level, fill_value, axis])	Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).
`subtract`(other[, level, fill_value, axis])	Get Subtraction of DataFrame or Series and other, element-wise (binary operator sub).
`sum`([axis, skipna, dtype, numeric_only, ...])	Return sum of the values in the DataFrame.
`tail`([n])	Returns the last n rows as a new DataFrame or Series
`take`(indices[, axis])	Return a new frame containing the rows specified by indices.
`tile`(count)	Repeats the rows count times to form a new Frame.
`to_arrow`()	Convert to a PyArrow Array.
`to_cupy`([dtype, copy, na_value])	Convert the Frame to a CuPy array.
`to_dict`([into])	Convert Series to {label -> value} dict or dict-like object.
`to_dlpack`()	Converts a cuDF object into a DLPack tensor.
`to_frame`([name])	Convert Series into a DataFrame
`to_hdf`(path_or_buf, key, *args, **kwargs)	Write the contained data to an HDF5 file using HDFStore.
`to_json`([path_or_buf])	Convert the cuDF object to a JSON string.
`to_numpy`([dtype, copy, na_value])	Convert the Frame to a NumPy array.
`to_pandas`(*[, index, nullable, arrow_type])	Convert to a pandas Series.
`to_string`()	Convert to string
`transpose`()	Return the transpose, which is by definition self.
`truediv`(other[, level, fill_value, axis])	Get Floating division of DataFrame or Series and other, element-wise (binary operator truediv).
`truncate`([before, after, axis, copy])	Truncate a Series or DataFrame before and after some index value.
`unique`()	Returns unique values of this Series.
`update`(other)	Modify Series in place using values from passed Series.
`value_counts`([normalize, sort, ascending, ...])	Return a Series containing counts of unique values.
`var`([axis, skipna, ddof, numeric_only])	Return unbiased variance of the DataFrame.
`where`(cond[, other, inplace])	Replace values where the condition is False.
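The `where` and `mask` entries above are mirror images of each other, which is easy to mix up. The sketch below applies both with the same condition. The names `xdf` and `to_host` are hypothetical helpers for this example, and the pandas fallback is an assumption so the snippet runs without a GPU.

```python
# Assumption: prefer cuDF, but fall back to pandas (whose API cuDF mirrors)
# when cuDF or a usable GPU is unavailable, so the sketch runs anywhere.
try:
    import cudf as xdf
    xdf.Series([0])  # forces GPU initialisation; raises without a usable device
except Exception:
    import pandas as xdf


def to_host(obj):
    """Hypothetical helper: return a pandas object for either backend."""
    return obj.to_pandas() if hasattr(obj, "to_pandas") else obj


s = xdf.Series([1, 2, 3, 4])
cond = s > 2

# where: keep values where cond is True, replace the rest with 0.
kept = to_host(s.where(cond, 0))

# mask: replace values where cond is True with 0, keep the rest.
hidden = to_host(s.mask(cond, 0))

print(kept.tolist())    # [0, 0, 3, 4]
print(hidden.tolist())  # [1, 2, 0, 0]
```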

to_list
tolist