scale
- cuml.preprocessing.scale(X, *, axis=0, with_mean=True, with_std=True, copy=True)
Standardize a dataset along any axis.
Center to the mean and scale component-wise to unit variance.
- Parameters:
- X : {array-like, sparse matrix}
The data to center and scale.
- axis : int (0 by default)
Axis along which the means and standard deviations are computed. If 0, standardize each feature independently; if 1, standardize each sample.
- with_mean : boolean, True by default
If True, center the data before scaling.
- with_std : boolean, True by default
If True, scale the data to unit variance (or equivalently, unit standard deviation).
- copy : boolean, optional, default True
Whether to force a copy of the input. If copy=False, a copy may still be made when the input has to be converted.
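The effect of axis, with_mean and with_std can be illustrated with a small CPU-side sketch. The helper below, scale_sketch, is a hypothetical NumPy reference written for this illustration; it is not cuML's implementation, and its handling of zero-variance features (leaving them unscaled) is an assumption.

```python
import numpy as np

def scale_sketch(X, axis=0, with_mean=True, with_std=True):
    """Hypothetical NumPy reference for the documented behavior; not cuML code."""
    X = np.array(X, dtype=np.float64)  # np.array copies, so the input is untouched
    if with_mean:
        X -= X.mean(axis=axis, keepdims=True)
    if with_std:
        std = X.std(axis=axis, ddof=0, keepdims=True)
        std[std == 0.0] = 1.0  # assumption: constant features are left unscaled
        X /= std
    return X

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

by_feature = scale_sketch(X, axis=0)  # each column gets mean 0 and std 1
by_sample = scale_sketch(X, axis=1)   # each row gets mean 0 and std 1
```

With axis=0 the statistics are computed per column (feature); with axis=1 they are computed per row (sample).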
See also
StandardScaler : Performs scaling to unit variance using the Transformer API.
Notes
This implementation will refuse to center sparse matrices, since doing so would make them non-sparse and could exhaust available memory.
Instead, the caller is expected either to set with_mean=False explicitly (in which case only variance scaling is performed on the features of the sparse matrix) or to densify the matrix if the materialized dense array is expected to fit in memory.
For optimal processing, the caller should pass a CSC matrix.
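Why centering defeats sparsity can be seen by counting non-zero entries. The sketch below uses a dense NumPy array that is mostly zeros as a stand-in for a sparse matrix (an assumption made so the example runs without a GPU or a sparse library); it does not call cuML.

```python
import numpy as np

# 100 x 5 array in which only every 10th row is non-zero.
X = np.zeros((100, 5))
X[::10] = np.arange(1.0, 6.0)

nnz_original = np.count_nonzero(X)  # 50

# Centering subtracts a non-zero column mean from every entry,
# so every former zero becomes non-zero.
centered = X - X.mean(axis=0)
nnz_centered = np.count_nonzero(centered)  # 500

# Variance-only scaling (the with_mean=False case) divides by the
# column standard deviation, so zeros stay zeros.
scaled_only = X / X.std(axis=0, ddof=0)
nnz_scaled = np.count_nonzero(scaled_only)  # 50
```

Here centering multiplies the number of stored values by ten, while variance-only scaling leaves the sparsity pattern unchanged.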
NaNs are treated as missing values: they are ignored when computing the statistics and preserved in the transformed data.
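The NaN behavior described above can be mimicked in NumPy with nanmean and nanstd. This is a minimal CPU sketch of the stated semantics, not a call into cuML.

```python
import numpy as np

X = np.array([[1.0,    2.0],
              [np.nan, 4.0],
              [3.0,    6.0]])

# Statistics skip the NaN: column 0 uses only the values 1.0 and 3.0.
mean = np.nanmean(X, axis=0)
std = np.nanstd(X, axis=0, ddof=0)

# The NaN propagates through the arithmetic and stays in place.
X_scaled = (X - mean) / std
```

The first column is standardized from its two observed values, and the missing entry remains NaN in the output.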
We use a biased estimator for the standard deviation, equivalent to numpy.std(x, ddof=0). Note that the choice of ddof is unlikely to affect model performance.
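The difference between the two estimators is a constant factor of sqrt(n / (n - 1)), which the following NumPy sketch makes concrete. The sample values are made up for the illustration.

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = x.size

biased = np.std(x, ddof=0)    # divides by n; this is the estimator used here
unbiased = np.std(x, ddof=1)  # divides by n - 1

# The two differ only by a constant factor that approaches 1 as n grows.
ratio = unbiased / biased
expected_ratio = np.sqrt(n / (n - 1))
```

Because the factor is the same for every element of a feature, switching ddof only rescales that feature uniformly.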