cuml.preprocessing

Binarizer

Binarize data (set feature values to 0 or 1) according to a threshold.

FunctionTransformer

Constructs a transformer from an arbitrary callable.

KBinsDiscretizer

Bin continuous data into intervals.

KernelCenterer

Center a kernel matrix.

LabelBinarizer

Binarize labels in a one-vs-all fashion.

LabelEncoder

Encode target labels with values between 0 and n_classes - 1.

MaxAbsScaler

Scale each feature by its maximum absolute value.

MinMaxScaler

Transform features by scaling each feature to a given range.

MissingIndicator

Binary indicators for missing values.

Normalizer

Normalize samples individually to unit norm.

OneHotEncoder

Encode categorical features as a one-hot numeric array.

PolynomialFeatures

Generate polynomial and interaction features.

PowerTransformer

Apply a power transform featurewise to make data more Gaussian-like.

QuantileTransformer

Transform features using quantile information.

RobustScaler

Scale features using statistics that are robust to outliers.

SimpleImputer

Imputation transformer for completing missing values.

StandardScaler

Standardize features by removing the mean and scaling to unit variance.

TargetEncoder

A cuDF-based implementation of target encoding [1], which replaces one or more categorical variables, 'Xs', with the average of the corresponding values of the target variable, 'Y'.

add_dummy_feature

Augment dataset with an additional dummy feature.

binarize

Boolean thresholding of an array-like or sparse matrix.

label_binarize

Binarize labels in a one-vs-all fashion.

maxabs_scale

Scale each feature to the [-1, 1] range without breaking the sparsity.

minmax_scale

Transform features by scaling each feature to a given range.

normalize

Scale input vectors individually to unit norm (vector length).

robust_scale

Standardize a dataset along any axis, centering to the median and scaling by the interquartile range.

scale

Standardize a dataset along any axis, centering to the mean and scaling to unit variance.
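
Several of the scaling and thresholding entries above reduce to a short per-feature formula. The following is a minimal pure-Python sketch of that arithmetic for a single feature column. It is not cuML's GPU implementation and does not call the cuML API. The helper names (`standardize_sketch`, `minmax_sketch`, `maxabs_sketch`, `binarize_sketch`) are hypothetical, and the handling of constant columns, the population standard deviation, and the strict `>` comparison are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def standardize_sketch(column):
    """Remove the mean and scale to unit variance (population std assumed)."""
    mean = fmean(column)
    std = pstdev(column)
    # Assumption: a constant column is centered but left unscaled.
    std = std if std > 0 else 1.0
    return [(x - mean) / std for x in column]


def minmax_sketch(column, feature_range=(0.0, 1.0)):
    """Map the column linearly onto feature_range."""
    lo, hi = feature_range
    cmin, cmax = min(column), max(column)
    # Assumption: a constant column maps to the lower bound.
    span = (cmax - cmin) or 1.0
    return [lo + (x - cmin) / span * (hi - lo) for x in column]


def maxabs_sketch(column):
    """Divide by the largest absolute value, so results lie in [-1, 1]."""
    peak = max(abs(x) for x in column) or 1.0
    return [x / peak for x in column]


def binarize_sketch(column, threshold=0.0):
    """Set values above the threshold to 1 and all others to 0."""
    return [1.0 if x > threshold else 0.0 for x in column]


column = [-2.0, 0.0, 2.0, 4.0]
print(standardize_sketch(column))
print(minmax_sketch(column))
print(maxabs_sketch(column))    # [-0.5, 0.0, 0.5, 1.0]
print(binarize_sketch(column))  # [0.0, 0.0, 1.0, 1.0]
```

Dividing by the maximum absolute value never shifts the data, so zeros stay zero; this is why that form of scaling preserves sparsity, while centering-based scaling does not.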

Text Preprocessing

PorterStemmer

A word stemmer based on the Porter stemming algorithm.