LabelBinarizer

class cuml.preprocessing.LabelBinarizer(*, neg_label=0, pos_label=1, sparse_output=False, verbose=False, output_type=None)

Binarize labels in a one-vs-all fashion.

Parameters:
neg_label : int, default=0

The value to use for encoding negative labels.

pos_label : int, default=1

The value to use for encoding positive labels.

sparse_output : bool, default=False

If True, transform returns a sparse CSR matrix.

verbose : int or bool, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

Attributes:
classes_ : numpy.ndarray of shape (n_classes,)

Holds the label for each class.

y_type_ : {‘binary’, ‘multiclass’, ‘multilabel-indicator’}

The type of the target data.

sparse_input_ : bool

Whether the input data to fit was a sparse matrix.

Methods

fit(y)

Fit label binarizer.

fit_transform(y)

Fit label binarizer and transform labels to binary labels.

inverse_transform(y, *[, threshold])

Transform binary labels back to original labels.

transform(y)

Transform labels to binary labels.

See also

label_binarize

A function version of this class.

Examples

>>> import cupy as cp
>>> from cuml.preprocessing import LabelBinarizer
>>> y = cp.array([1, 2, 6, 4, 2])
>>> lb = LabelBinarizer().fit(y)
>>> lb.classes_
array([1, 2, 4, 6])
>>> lb.transform(cp.array([1, 6]))
array([[1, 0, 0, 0],
       [0, 0, 0, 1]], dtype=int32)

Binary targets result in a column vector:

>>> import numpy as np
>>> lb = LabelBinarizer()
>>> lb.fit_transform(np.array(['a', 'b', 'b', 'a']))
array([[0],
       [1],
       [1],
       [0]], dtype=int32)
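
The encoding shown in the examples above can be sketched in plain Python. This is a minimal illustration of the one-vs-all scheme and of how neg_label and pos_label fill the output, not cuML's GPU implementation; the helper name binarize is hypothetical and the sketch assumes the classes are sorted, as in the classes_ output above.

```python
def binarize(labels, classes, neg_label=0, pos_label=1):
    """One-vs-all encoding sketch (hypothetical helper, not part of cuML)."""
    classes = sorted(classes)
    if len(classes) == 2:
        # Binary targets: a single column that marks the second class.
        return [[pos_label if y == classes[1] else neg_label] for y in labels]
    # Multiclass targets: one column per class, pos_label in the matching one.
    return [[pos_label if y == c else neg_label for c in classes] for y in labels]


classes = sorted(set([1, 2, 6, 4, 2]))
print(classes)                                      # [1, 2, 4, 6]
print(binarize([1, 6], classes))                    # [[1, 0, 0, 0], [0, 0, 0, 1]]
print(binarize(["a", "b", "b", "a"], ["a", "b"]))   # [[0], [1], [1], [0]]
print(binarize([1, 6], classes, neg_label=-1))      # [[1, -1, -1, -1], [-1, -1, -1, 1]]
```

The last call shows that neg_label only changes the fill value of the non-matching columns; the position of pos_label is unchanged.
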
fit(y) -> LabelBinarizer

Fit label binarizer.

Parameters:
y : array of shape (n_samples,) or (n_samples, n_classes)

Target values. A 2-D matrix should contain only 0 and 1, in the multilabel-indicator format.

Returns:
self : LabelBinarizer

Returns the instance itself.

fit_transform(y)

Fit label binarizer and transform labels to binary labels.

Parameters:
y : array-like or sparse matrix, shape (n_samples,) or (n_samples, n_classes)

Target values. A 2-D matrix should contain only 0 and 1, in the multilabel-indicator format.

Returns:
y : array or sparse matrix

The encoded labels. Shape will be (n_samples, 1) for binary classification problems. Will be a sparse matrix if sparse_output=True.

inverse_transform(y, *, threshold=None)

Transform binary labels back to original labels.

Parameters:
y : array-like or sparse matrix, shape (n_samples, n_classes)

The encoded target values.

threshold : float, default=None

Threshold used in the binary and multilabel-indicator cases. If None, the threshold is taken to be halfway between neg_label and pos_label.

Returns:
y : array or sparse matrix, shape (n_samples,) or (n_samples, n_classes)

The original target values.
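
The default threshold rule can be illustrated with a small pure-Python sketch. The helper inverse_binarize is hypothetical and is not cuML's implementation. It covers the binary case described above and assumes that the multiclass case simply picks the highest-scoring column, as scikit-learn's LabelBinarizer does; the multilabel-indicator case is omitted.

```python
def inverse_binarize(encoded, classes, neg_label=0, pos_label=1, threshold=None):
    """Decode binarized rows back to labels (hypothetical sketch, not cuML code)."""
    if threshold is None:
        # Default: halfway between neg_label and pos_label.
        threshold = (neg_label + pos_label) / 2
    classes = sorted(classes)
    if len(classes) == 2:
        # Binary: one column; values above the threshold map to the second class.
        return [classes[1] if row[0] > threshold else classes[0] for row in encoded]
    # Multiclass (assumption): take the column with the largest value.
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in encoded]


print(inverse_binarize([[0.2], [0.9]], ["a", "b"]))                    # ['a', 'b']
print(inverse_binarize([[-0.5], [0.5]], ["a", "b"], neg_label=-1))     # ['a', 'b']
print(inverse_binarize([[0.1, 0.7, 0.1, 0.1]], [1, 2, 4, 6]))          # [2]
```

With neg_label=-1 and pos_label=1 the default threshold becomes 0, which is why -0.5 decodes to the first class and 0.5 to the second.
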

transform(y)

Transform labels to binary labels.

Parameters:
y : array-like or sparse matrix, shape (n_samples,) or (n_samples, n_classes)

Target values. A 2-D matrix should contain only 0 and 1, in the multilabel-indicator format.

Returns:
y : array or sparse matrix

The encoded labels. Will be a sparse matrix if sparse_output=True. Shape will be (n_samples, n_classes) for multiclass problems, (n_samples, 1) for binary problems with no unseen classes, and (n_samples, 2) for binary problems with unseen classes (a minor, intentional deviation from sklearn).
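
To give a sense of what a sparse result holds when sparse_output=True, the sketch below converts a small dense encoding into the three arrays of the CSR layout (data, indices, indptr). It is a plain-Python illustration of the format only; the helper dense_to_csr is hypothetical, and the actual object returned by cuML is a sparse matrix type, not a tuple of lists.

```python
def dense_to_csr(dense):
    """Return CSR triples (data, indices, indptr) for a dense row-major matrix."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # stored nonzero value
                indices.append(col)   # column index of that value
        indptr.append(len(data))      # running count marks where each row ends
    return data, indices, indptr


encoded = [[1, 0, 0, 0],
           [0, 0, 0, 1]]
data, indices, indptr = dense_to_csr(encoded)
print(data)      # [1, 1]
print(indices)   # [0, 3]
print(indptr)    # [0, 1, 2]
```

For a multiclass encoding with neg_label=0, each row stores a single nonzero entry, so the sparse form needs roughly n_samples values instead of n_samples * n_classes.
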