MissingIndicator
- class cuml.preprocessing.MissingIndicator(*args, **kwargs)
Binary indicators for missing values.
Note that this component typically should not be used in a vanilla
Pipeline consisting of transformers and a classifier, but rather could be added using a FeatureUnion or ColumnTransformer.
- Parameters:
- missing_values : number, string, np.nan (default) or None
The placeholder for the missing values. All occurrences of
missing_values will be imputed. For pandas’ dataframes with nullable integer dtypes with missing values, missing_values should be set to np.nan, since pd.NA will be converted to np.nan.
- features : str, default="missing-only"
Whether the imputer mask should represent all or a subset of features.
If “missing-only” (default), the imputer mask will only represent features containing missing values during fit time.
If “all”, the imputer mask will represent all features.
- sparse : boolean or “auto”, default="auto"
Whether the imputer mask format should be sparse or dense.
If “auto” (default), the imputer mask will be of same type as input.
If True, the imputer mask will be a sparse matrix.
If False, the imputer mask will be a numpy array.
- error_on_new : boolean, default=True
If True (default), transform will raise an error when there are features with missing values in transform that have no missing values in fit. This is applicable only when
features="missing-only".
- Attributes:
- features_ : ndarray, shape (n_missing_features,) or (n_features,)
The features indices which will be returned when calling
transform. They are computed during fit. For features='all', it is equal to range(n_features).
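The interaction between features, error_on_new and features_ described above can be sketched in plain Python. This is only an illustration of the documented semantics, not the library's implementation: the helper names is_missing, fit_features and indicator_mask are hypothetical, np.nan is assumed to be the placeholder, and dense row lists stand in for arrays.

```python
import math


def is_missing(value):
    """Hypothetical helper: True when `value` is NaN (the assumed placeholder)."""
    return isinstance(value, float) and math.isnan(value)


def fit_features(rows, features="missing-only"):
    """Return the column indices that play the role of `features_`."""
    n_features = len(rows[0])
    if features == "all":
        return list(range(n_features))
    # "missing-only": keep columns with at least one missing value at fit time.
    return [j for j in range(n_features) if any(is_missing(r[j]) for r in rows)]


def indicator_mask(rows, features_, features="missing-only", error_on_new=True):
    """Build the boolean mask, restricted to the columns in `features_`."""
    n_features = len(rows[0])
    if features == "missing-only" and error_on_new:
        # Columns with missing values now that had none during fit.
        new = [j for j in range(n_features)
               if j not in features_ and any(is_missing(r[j]) for r in rows)]
        if new:
            raise ValueError(f"features {new} have missing values only in transform")
    return [[is_missing(r[j]) for j in features_] for r in rows]


nan = float("nan")
X1 = [[nan, 1, 3], [4, 0, nan], [8, 1, 0]]
X2 = [[5, 1, nan], [nan, 2, 3], [2, 4, 0]]

features_ = fit_features(X1)           # columns 0 and 2 contain NaN in X1
mask = indicator_mask(X2, features_)   # one mask column per entry of features_
print(features_, mask)
```

With features="all" the same sketch keeps every column, so the mask has n_features columns and the error_on_new check never applies.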
Methods
fit(X[, y]): Fit the transformer on X.
fit_transform(X[, y]): Generate missing values indicator for X.
transform(X): Generate missing values indicator for X.
Examples
>>> import numpy as np
>>> from sklearn.impute import MissingIndicator
>>> X1 = np.array([[np.nan, 1, 3],
...                [4, 0, np.nan],
...                [8, 1, 0]])
>>> X2 = np.array([[5, 1, np.nan],
...                [np.nan, 2, 3],
...                [2, 4, 0]])
>>> indicator = MissingIndicator()
>>> indicator.fit(X1)
MissingIndicator()
>>> X2_tr = indicator.transform(X2)
>>> X2_tr
array([[False,  True],
       [ True, False],
       [False, False]])
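The note at the top of this page recommends adding the indicator through a FeatureUnion rather than placing it directly in a Pipeline. A minimal sketch of that pattern follows. It uses the scikit-learn classes, as the example above does, and assumes scikit-learn is installed; the step names "imputed" and "indicators" and the choice of mean imputation are illustrative assumptions, and whether the cuML class can be dropped into a scikit-learn FeatureUnion unchanged is not confirmed here.

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer
from sklearn.pipeline import FeatureUnion

X = np.array([[np.nan, 1, 3],
              [4, 0, np.nan],
              [8, 1, 0]])

# Concatenate the imputed features with the missing-value indicators, so a
# downstream estimator sees both the filled values and where they were filled.
union = FeatureUnion([
    ("imputed", SimpleImputer(strategy="mean")),   # step names are arbitrary
    ("indicators", MissingIndicator()),            # default: missing-only
])
combined = union.fit_transform(X)

# Three imputed columns followed by two indicator columns (features 0 and 2).
print(combined.shape)
```

Used on its own inside a Pipeline, the indicator would replace the feature values with the mask, which is why the combined form is usually preferred.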
- fit(X, y=None) -> MissingIndicator
Fit the transformer on X.
- Parameters:
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
Input data, where
n_samples is the number of samples and n_features is the number of features.
- Returns:
- self : object
Returns self.
- fit_transform(X, y=None) -> SparseCumlArray
Generate missing values indicator for X.
- Parameters:
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
The input data to complete.
- Returns:
- Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)
The missing indicator for input data. The data type of
Xt will be boolean.
- transform(X) -> SparseCumlArray
Generate missing values indicator for X.
- Parameters:
- X : {array-like, sparse matrix}, shape (n_samples, n_features)
The input data to complete.
- Returns:
- Xt : {ndarray or sparse matrix}, shape (n_samples, n_features) or (n_samples, n_features_with_missing)
The missing indicator for input data. The data type of
Xt will be boolean.
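The two possible output shapes of transform, and the effect of error_on_new, can be seen in a short sketch. It again uses the scikit-learn class from the Examples section and assumes scikit-learn is installed; the arrays X_fit and X_new are made up for illustration, and the cuML class may return a device array rather than a NumPy array.

```python
import numpy as np
from sklearn.impute import MissingIndicator

X_fit = np.array([[np.nan, 1, 3],
                  [4, 0, np.nan],
                  [8, 1, 0]])
# Feature 1 has a missing value here but had none in X_fit.
X_new = np.array([[1, np.nan, 2],
                  [3, 4, 5]])

# Default settings: features="missing-only", error_on_new=True.
strict = MissingIndicator().fit(X_fit)
try:
    strict.transform(X_new)
    raised = False
except ValueError:
    raised = True  # a newly missing feature is reported as an error

# error_on_new=False: the new feature is ignored, shape (n_samples, 2).
lenient = MissingIndicator(error_on_new=False).fit(X_fit)
mask_lenient = lenient.transform(X_new)

# features="all": every feature is reported, shape (n_samples, n_features).
mask_all = MissingIndicator(features="all").fit(X_fit).transform(X_new)

print(raised, mask_lenient.shape, mask_all.shape)
```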