LedoitWolf
- class cuml.covariance.LedoitWolf(*, store_precision=True, assume_centered=False, block_size=1000, verbose=False, output_type=None)
LedoitWolf Estimator for covariance matrix estimation.
Computes the Ledoit-Wolf shrinkage estimator for the covariance matrix. This estimator regularizes the empirical covariance by shrinking it towards a scaled identity matrix, with the shrinkage coefficient determined by the Ledoit-Wolf formula.
The regularized covariance is:
(1 - shrinkage) * cov + shrinkage * mu * np.identity(n_features)
where mu = trace(cov) / n_features and shrinkage is computed to minimize the Mean Squared Error between the regularized estimate and the true covariance.
- Parameters:
- store_precision : bool, default=True
Specifies if the estimated precision matrix is stored.
- assume_centered : bool, default=False
If True, data will not be centered before computation. Useful when working with data whose mean is almost, but not exactly, zero. If False (default), data will be centered before computation.
- block_size : int, default=1000
Size of blocks into which the covariance matrix will be split during its Ledoit-Wolf estimation. This is purely a memory optimization and does not affect results.
- verbose : int or bool, default=False
Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
- output_type : {'input', 'array', 'dataframe', 'series', 'df_obj', 'numba', 'cupy', 'numpy', 'cudf', 'pandas'}, default=None
Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
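The statement that block_size is purely a memory optimization can be illustrated with a small NumPy sketch. It assumes, without confirmation from the cuML source, that the blocked quantity is a sum over the Gram matrix X.T @ X and that blocks are taken over feature columns. The helper name sum_squared_gram_blocked is hypothetical and not part of cuML.

```python
import numpy as np

def sum_squared_gram_blocked(X, block_size):
    """Sum of squared entries of X.T @ X, computed one tile at a time.

    Hypothetical helper: only a tile of at most (block_size x block_size)
    entries of the Gram matrix is materialized at any point.
    """
    n_features = X.shape[1]
    total = 0.0
    for i in range(0, n_features, block_size):
        for j in range(0, n_features, block_size):
            tile = X[:, i:i + block_size].T @ X[:, j:j + block_size]
            total += float(np.sum(tile ** 2))
    return total

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 7))

# Reference value from the full Gram matrix, held in memory all at once.
full = float(np.sum((X.T @ X) ** 2))
# Same quantity accumulated from 3-column blocks.
blocked = sum_squared_gram_blocked(X, block_size=3)
```

Up to floating-point rounding, both values agree for any block size, which is why the choice only trades memory for the number of tiles processed.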
- Attributes:
- covariance_ : ndarray of shape (n_features, n_features)
Estimated covariance matrix.
- location_ : ndarray of shape (n_features,)
Estimated location, i.e., the estimated mean.
- precision_ : ndarray of shape (n_features, n_features)
Estimated pseudo-inverse matrix. Only stored if store_precision is True.
- shrinkage_ : float
Coefficient in the convex combination used for the computation of the shrunk estimate. Range is [0, 1].
- n_features_in_ : int
Number of features seen during fit.
Methods
error_norm(comp_cov[, norm, scaling, squared]): Compute the Mean Squared Error between two covariance estimators.
fit(X[, y, convert_dtype]): Fit the Ledoit-Wolf shrunk covariance model to X.
get_precision(): Getter for the precision matrix.
mahalanobis(X): Compute the squared Mahalanobis distances of given observations.
score(X_test[, y]): Compute the log-likelihood of X_test under the estimated model.
See also
sklearn.covariance.LedoitWolf: The scikit-learn CPU implementation.
References
O. Ledoit and M. Wolf, “A Well-Conditioned Estimator for Large-Dimensional Covariance Matrices”, Journal of Multivariate Analysis, Volume 88, Issue 2, February 2004, pages 365-411.
Examples
>>> import cupy as cp
>>> from cuml.covariance import LedoitWolf
>>> rng = cp.random.RandomState(42)
>>> X = rng.randn(100, 5)
>>> lw = LedoitWolf().fit(X)
>>> lw.covariance_.shape
(5, 5)
>>> lw.shrinkage_
0.123...
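For readers without a GPU, the shrinkage formula from the class description can be sketched in plain NumPy. The coefficient below follows the estimator from the Ledoit and Wolf reference in the unblocked form used by scikit-learn's CPU implementation. Whether cuML computes it identically is an assumption, and ledoit_wolf_sketch is a hypothetical name, not a cuML function.

```python
import numpy as np

def ledoit_wolf_sketch(X, assume_centered=False):
    """Return (shrunk_cov, shrinkage) for X of shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=np.float64)
    n_samples, n_features = X.shape
    if not assume_centered:
        X = X - X.mean(axis=0)

    cov = X.T @ X / n_samples            # empirical covariance
    mu = np.trace(cov) / n_features      # scale of the identity target

    # Squared Frobenius distance between cov and mu * I, per feature.
    delta = np.sum((cov - mu * np.eye(n_features)) ** 2) / n_features

    # Estimated sampling variance of cov, per feature.
    sq_row_norms = np.sum(X ** 2, axis=1)  # squared norm of each sample
    beta = np.sum(sq_row_norms ** 2) / n_samples - np.sum(cov ** 2)
    beta /= n_features * n_samples
    beta = min(max(beta, 0.0), delta)      # keeps shrinkage within [0, 1]

    shrinkage = 0.0 if beta == 0 else beta / delta
    shrunk = (1 - shrinkage) * cov + shrinkage * mu * np.eye(n_features)
    return shrunk, float(shrinkage)

rng = np.random.default_rng(42)
X = rng.standard_normal((100, 5))
shrunk_cov, shrinkage = ledoit_wolf_sketch(X)
print(shrunk_cov.shape, shrinkage)
```

Because the target mu * I has the same trace as cov, the convex combination leaves the trace of the covariance unchanged and only pulls the eigenvalues toward their mean.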
- error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)
Compute the Mean Squared Error between two covariance estimators.
- Parameters:
- comp_cov : array-like of shape (n_features, n_features)
The covariance to compare with.
- norm : {"frobenius", "spectral"}, default="frobenius"
The type of norm used to compute the error.
- scaling : bool, default=True
If True, the squared error is scaled by n_features.
- squared : bool, default=True
If True, return the squared error norm. If False, return the error norm.
- Returns:
- error : float
The Mean Squared Error (in the sense of the Frobenius norm) between self and comp_cov.
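How norm, scaling, and squared interact can be sketched in NumPy as follows. The sketch assumes the same semantics as scikit-learn's error_norm (scaling divides the squared norm by n_features, and squared=False takes the square root afterwards). This is not confirmed for cuML by the text above, and error_norm_sketch is a hypothetical name.

```python
import numpy as np

def error_norm_sketch(cov, comp_cov, norm="frobenius", scaling=True, squared=True):
    """Error between two covariance matrices (assumed scikit-learn semantics)."""
    error = np.asarray(comp_cov, dtype=float) - np.asarray(cov, dtype=float)
    if norm == "frobenius":
        squared_norm = np.sum(error ** 2)
    elif norm == "spectral":
        # Largest singular value of the error matrix, squared.
        squared_norm = np.linalg.norm(error, 2) ** 2
    else:
        raise ValueError("norm must be 'frobenius' or 'spectral'")
    if scaling:
        squared_norm = squared_norm / error.shape[0]
    return float(squared_norm if squared else np.sqrt(squared_norm))

cov = np.eye(2)
comp_cov = np.diag([3.0, 1.0])
mse = error_norm_sketch(cov, comp_cov)                   # scaled, squared
rmse = error_norm_sketch(cov, comp_cov, squared=False)   # scaled, square root
```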
- fit(X, y=None, *, convert_dtype=True) -> LedoitWolf
Fit the Ledoit-Wolf shrunk covariance model to X.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Training data, where n_samples is the number of samples and n_features is the number of features.
- y : Ignored
Not used, present for API consistency.
- convert_dtype : bool, default=True
If True, convert the input data to float32.
- Returns:
- self : LedoitWolf
Returns the instance itself.
- get_precision()
Getter for the precision matrix.
- Returns:
- precision_ : ndarray of shape (n_features, n_features)
The precision matrix associated with the current covariance object.
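The attribute list describes the precision as the pseudo-inverse of the estimated covariance. A minimal NumPy sketch of that relationship follows. np.linalg.pinv stands in for whatever GPU routine cuML uses internally, which the text above does not specify.

```python
import numpy as np

# A small symmetric positive-definite covariance matrix (illustrative values).
cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])

# The precision matrix as the Moore-Penrose pseudo-inverse of the covariance.
precision = np.linalg.pinv(cov)

# For a full-rank covariance this product is the identity, up to rounding.
product = cov @ precision
```

For a full-rank covariance the pseudo-inverse coincides with the ordinary inverse. It remains defined when the covariance is singular.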
- mahalanobis(X)
Compute the squared Mahalanobis distances of given observations.
- Parameters:
- X : array-like of shape (n_samples, n_features)
The observations whose Mahalanobis distances are computed.
- Returns:
- mahalanobis_distances : ndarray of shape (n_samples,)
Squared Mahalanobis distances of the observations.
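The squared Mahalanobis distance of an observation x is (x - location)^T precision (x - location). A NumPy sketch of that definition is shown below, with location and precision standing in for the fitted location_ and precision_ attributes. The function name squared_mahalanobis is hypothetical.

```python
import numpy as np

def squared_mahalanobis(X, location, precision):
    """Squared Mahalanobis distance of each row of X to `location`."""
    centered = np.asarray(X, dtype=float) - location
    # Row-wise quadratic form: sum_jk centered[i, j] * precision[j, k] * centered[i, k]
    return np.einsum("ij,jk,ik->i", centered, precision, centered)

location = np.array([1.0, 1.0])
precision = np.diag([0.25, 1.0])   # inverse of a covariance with variances 4 and 1
X = np.array([[3.0, 1.0],
              [1.0, 4.0]])
distances = squared_mahalanobis(X, location, precision)
```

With these illustrative values the first point lies 2 units away along an axis with variance 4 and the second 3 units away along an axis with variance 1, so the squared distances are 1 and 9.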
- score(X_test, y=None) -> float
Compute the log-likelihood of X_test under the estimated model.
The log-likelihood is computed using the Gaussian model.
- Parameters:
- X_test : array-like of shape (n_samples, n_features)
Test data for which the likelihood is computed.
- y : Ignored
Not used, present for API consistency.
- Returns:
- log_likelihood : float
Log-likelihood of the data under the fitted Gaussian model.
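A NumPy sketch of a Gaussian log-likelihood score is given below. It assumes the scikit-learn convention, in which the score is the average per-sample log-density of X_test under a Gaussian with the fitted location and precision. The text above does not say whether cuML averages or sums, so treat that as an assumption. gaussian_score_sketch is a hypothetical name.

```python
import numpy as np

def gaussian_score_sketch(X_test, location, precision):
    """Average Gaussian log-likelihood of the rows of X_test."""
    centered = np.asarray(X_test, dtype=float) - location
    n_samples, n_features = centered.shape
    # Covariance of the test data around the fitted location.
    test_cov = centered.T @ centered / n_samples
    _, logdet_precision = np.linalg.slogdet(precision)
    # sum(test_cov * precision) equals trace(test_cov @ precision) for symmetric inputs.
    log_likelihood = -np.sum(test_cov * precision) + logdet_precision
    log_likelihood -= n_features * np.log(2 * np.pi)
    return float(log_likelihood / 2.0)

location = np.zeros(2)
precision = np.diag([1.0, 0.5])
X_test = np.array([[1.0, 0.0],
                   [-1.0, 2.0],
                   [0.5, -1.0]])
score = gaussian_score_sketch(X_test, location, precision)
```

Writing the score through the test covariance avoids looping over samples: the quadratic forms of all rows collapse into a single elementwise product with the precision matrix.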