LabelBinarizer
- class cuml.preprocessing.LabelBinarizer(*, neg_label=0, pos_label=1, sparse_output=False, verbose=False, output_type=None)
Binarize labels in a one-vs-all fashion.
- Parameters:
- neg_label : int, default=0
The value to use for encoding negative labels.
- pos_label : int, default=1
The value to use for encoding positive labels.
- sparse_output : bool, default=False
If True, a sparse CSR matrix is returned from transform.
- verbose : int or boolean, default=False
Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
- output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None
Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
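The effect of neg_label and pos_label can be illustrated without a GPU. The following is a minimal NumPy sketch of one-vs-all encoding, not cuML's implementation; the helper name `binarize_one_vs_all` is hypothetical and exists only for this illustration.

```python
import numpy as np


def binarize_one_vs_all(y, neg_label=0, pos_label=1):
    """Hypothetical helper: one-vs-all encoding of a 1-d label array."""
    y = np.asarray(y)
    classes = np.unique(y)  # sorted unique labels, one output column each
    # Compare every sample with every class: boolean (n_samples, n_classes).
    hits = y[:, None] == classes[None, :]
    # Matching positions get pos_label, all others get neg_label.
    encoded = np.where(hits, pos_label, neg_label).astype(np.int32)
    return classes, encoded


classes, encoded = binarize_one_vs_all([1, 2, 6, 4, 2], neg_label=-1, pos_label=1)
print(classes)     # [1 2 4 6]
print(encoded[0])  # [ 1 -1 -1 -1]
```

Each row holds exactly one pos_label entry, in the column of the sample's class.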
- Attributes:
- classes_ : numpy.ndarray of shape (n_classes,)
Holds the label for each class.
- y_type_ : {‘binary’, ‘multiclass’, ‘multilabel-indicator’}
The type of the target data.
- sparse_input_ : bool
Whether the input data to fit was a sparse matrix.
Methods
fit(y): Fit label binarizer.
fit_transform(y): Fit label binarizer and transform labels to binary labels.
inverse_transform(y, *[, threshold]): Transform binary labels back to original labels.
transform(y): Transform labels to binary labels.
See also
label_binarize : A function version of this class.
Examples
>>> import cupy as cp
>>> from cuml.preprocessing import LabelBinarizer
>>> y = cp.array([1, 2, 6, 4, 2])
>>> lb = LabelBinarizer().fit(y)
>>> lb.classes_
array([1, 2, 4, 6])
>>> lb.transform(cp.array([1, 6]))
array([[1, 0, 0, 0],
       [0, 0, 0, 1]], dtype=int32)
Binary targets result in a column vector:
>>> import numpy as np
>>> lb = LabelBinarizer()
>>> lb.fit_transform(np.array(['a', 'b', 'b', 'a']))
array([[0],
       [1],
       [1],
       [0]], dtype=int32)
- fit(y) → LabelBinarizer
Fit label binarizer.
- Parameters:
- y : array of shape (n_samples,) or (n_samples, n_classes)
Target values. The 2-d matrix should only contain 0 and 1, in the multilabel-indicator format.
- Returns:
- self : LabelBinarizer
Returns the instance itself.
- fit_transform(y)
Fit label binarizer and transform labels to binary labels.
- Parameters:
- y : array-like or sparse matrix, shape (n_samples,) or (n_samples, n_classes)
Target values. The 2-d matrix should only contain 0 and 1, in the multilabel-indicator format.
- Returns:
- y : array or sparse matrix
The encoded labels. Shape will be (n_samples, 1) for binary classification problems. Will be a sparse matrix if sparse_output=True.
- inverse_transform(y, *, threshold=None)
Transform binary labels back to original labels.
- Parameters:
- y : array-like or sparse matrix, shape (n_samples, n_classes)
The encoded target values.
- threshold : float, default=None
Threshold used in the binary and multilabel-indicator cases. If None, the threshold is assumed to be halfway between neg_label and pos_label.
- Returns:
- y : array or sparse matrix, shape (n_samples,) or (n_samples, n_classes)
The original target values.
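To make the default threshold concrete, here is a minimal NumPy sketch of the binary case only. It is an illustration under stated assumptions rather than cuML's code: the helper `inverse_binary` is hypothetical, and the strict `>` comparison is an assumption, since the text above does not say how values equal to the threshold are handled.

```python
import numpy as np


def inverse_binary(y_encoded, classes, neg_label=0, pos_label=1, threshold=None):
    """Hypothetical helper: map a binary column vector back to labels."""
    if threshold is None:
        # Default described above: halfway between neg_label and pos_label.
        threshold = (neg_label + pos_label) / 2.0
    flat = np.asarray(y_encoded).ravel()
    # Assumption: values strictly above the threshold map to classes[1].
    idx = (flat > threshold).astype(int)
    return np.asarray(classes)[idx]


classes = np.array(["a", "b"])
scores = np.array([[0.1], [0.9], [0.6], [0.4]])

labels = inverse_binary(scores, classes)                 # threshold = 0.5
strict = inverse_binary(scores, classes, threshold=0.7)  # explicit threshold
print(labels)  # ['a' 'b' 'b' 'a']
print(strict)  # ['a' 'b' 'a' 'a']
```

Raising the threshold moves borderline scores such as 0.6 from the positive class to the negative class.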
- transform(y)
Transform labels to binary labels.
- Parameters:
- y : array-like or sparse matrix, shape (n_samples,) or (n_samples, n_classes)
Target values. The 2-d matrix should only contain 0 and 1, in the multilabel-indicator format.
- Returns:
- y : array or sparse matrix
The encoded labels. Will be a sparse matrix if sparse_output=True. Shape will be (n_samples, n_classes) for multiclass problems, (n_samples, 1) for binary problems with no unseen classes, and (n_samples, 2) for binary problems with unseen classes (a minor, intentional deviation from sklearn).
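The multiclass and binary output shapes can be sketched in plain NumPy. This is a hedged illustration, not cuML's implementation: the helper `encode` is hypothetical, and it deliberately leaves out the unseen-class case, whose (n_samples, 2) layout is not specified in enough detail here to reproduce.

```python
import numpy as np


def encode(y, classes):
    """Hypothetical helper: encode y against an already-fitted class list."""
    y = np.asarray(y)
    classes = np.asarray(classes)
    # One column per class: shape (n_samples, n_classes).
    full = (y[:, None] == classes[None, :]).astype(np.int32)
    if len(classes) == 2:
        # Binary problems keep only the column for classes[1],
        # giving a column vector of shape (n_samples, 1).
        return full[:, 1:]
    return full


multi = encode([1, 6], classes=[1, 2, 4, 6])
binary = encode(["a", "b", "b", "a"], classes=["a", "b"])
print(multi.shape)     # (2, 4)
print(binary.shape)    # (4, 1)
print(binary.ravel())  # [0 1 1 0]
```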