EmpiricalCovariance

class cuml.covariance.EmpiricalCovariance(*, store_precision=True, assume_centered=False, verbose=False, output_type=None)

Maximum likelihood covariance estimator.

Parameters:
store_precision : bool, default=True

Whether to store the estimated precision matrix.

assume_centered : bool, default=False

If True, the data are not centered before computation. This is useful when the mean of the data is approximately, but not exactly, zero. If False (default), the data are centered before computation.

verbose : int or bool, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

output_type : {'input', 'array', 'dataframe', 'series', 'df_obj', 'numba', 'cupy', 'numpy', 'cudf', 'pandas'}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

Attributes:
covariance_ : ndarray of shape (n_features, n_features)

Estimated covariance matrix.

location_ : ndarray of shape (n_features,)

Estimated location, i.e., the estimated mean.

precision_ : ndarray of shape (n_features, n_features)

Estimated pseudo-inverse of the covariance matrix. Only stored if store_precision is True.

n_features_in_ : int

Number of features seen during fit.

Methods

error_norm(comp_cov[, norm, scaling, squared])

Compute the Mean Squared Error between two covariance estimators.

fit(X[, y, convert_dtype])

Fit the maximum likelihood covariance estimator to X.

get_precision()

Getter for the precision matrix.

mahalanobis(X)

Compute the squared Mahalanobis distances of given observations.

score(X_test[, y])

Compute the log-likelihood of X_test under the estimated model.

See also

sklearn.covariance.EmpiricalCovariance

The scikit-learn CPU implementation.

Examples

>>> import cupy as cp
>>> from cuml.covariance import EmpiricalCovariance
>>> rng = cp.random.RandomState(42)
>>> X = rng.randn(100, 5)
>>> cov = EmpiricalCovariance().fit(X)
>>> cov.covariance_.shape
(5, 5)
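
As a CPU-side illustration of what "maximum likelihood" means here, the sketch below computes the same kind of estimate with NumPy. It assumes the scikit-learn convention (normalising by n_samples rather than n_samples - 1); the helper name empirical_covariance_sketch is hypothetical and is not part of cuML.

```python
import numpy as np


def empirical_covariance_sketch(X, assume_centered=False):
    """Hypothetical NumPy reference, not cuML's implementation.

    Returns the maximum likelihood covariance (divides by n_samples)
    together with the location used for centering.
    """
    X = np.asarray(X, dtype=np.float64)
    n_samples, n_features = X.shape
    if assume_centered:
        # Assumption: the location is taken to be zero when centering is skipped.
        location = np.zeros(n_features)
    else:
        location = X.mean(axis=0)
    X_centered = X - location
    covariance = X_centered.T @ X_centered / n_samples
    return covariance, location


rng = np.random.RandomState(42)
X = rng.randn(100, 5)
covariance, location = empirical_covariance_sketch(X)
```

With centering enabled, the result agrees with np.cov(X.T, bias=True), which is a convenient cross-check against a GPU result copied back to the host.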
error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)

Compute the Mean Squared Error between two covariance estimators.

Parameters:
comp_cov : array-like of shape (n_features, n_features)

The covariance to compare with.

norm : {"frobenius", "spectral"}, default="frobenius"

The type of norm used to compute the error.

scaling : bool, default=True

If True, the squared error is scaled by n_features.

squared : bool, default=True

If True, return squared error. If False, return error.

Returns:
error : float

The error between the covariance estimated by self and comp_cov, computed under the selected norm (Frobenius by default).
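
The interaction of norm, scaling, and squared can be sketched in NumPy as follows. This assumes the scikit-learn definition (the squared norm of the difference, divided by n_features when scaling is True); error_norm_sketch is a hypothetical helper, not the cuML method.

```python
import numpy as np


def error_norm_sketch(cov, comp_cov, norm="frobenius", scaling=True, squared=True):
    """Hypothetical NumPy reference for the error between two covariances."""
    error = np.asarray(comp_cov, dtype=np.float64) - np.asarray(cov, dtype=np.float64)
    if norm == "frobenius":
        # Squared Frobenius norm: sum of squared entries.
        squared_norm = np.sum(error ** 2)
    elif norm == "spectral":
        # Squared spectral norm: largest singular value, squared.
        squared_norm = np.linalg.norm(error, ord=2) ** 2
    else:
        raise ValueError("norm must be 'frobenius' or 'spectral'")
    if scaling:
        # Assumption: scaling divides by n_features.
        squared_norm = squared_norm / error.shape[0]
    return squared_norm if squared else np.sqrt(squared_norm)


cov = np.eye(2)
comp_cov = np.array([[2.0, 0.0], [0.0, 1.0]])
scaled_error = error_norm_sketch(cov, comp_cov)
unscaled_error = error_norm_sketch(cov, comp_cov, scaling=False)
```

Here the two matrices differ in a single entry by 1, so the unscaled squared Frobenius error is 1 and the scaled one is 1 / n_features.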

fit(X, y=None, *, convert_dtype=True) -> EmpiricalCovariance

Fit the maximum likelihood covariance estimator to X.

Parameters:
X : array-like of shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : Ignored

Not used, present for API consistency.

convert_dtype : bool, default=True

If True, convert the input data to float32.

Returns:
self : EmpiricalCovariance

Returns the instance itself.

get_precision()

Getter for the precision matrix.

Returns:
precision_ : ndarray of shape (n_features, n_features)

The precision matrix associated with the current covariance object.
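
For intuition, the precision matrix described above is the pseudo-inverse of the covariance. A minimal NumPy sketch follows; how cuML computes it internally is not stated here, so np.linalg.pinv is used purely as a reference.

```python
import numpy as np

# Full-rank covariance: the pseudo-inverse coincides with the ordinary inverse.
covariance = np.array([[2.0, 0.5], [0.5, 1.0]])
precision = np.linalg.pinv(covariance)
identity_check = precision @ covariance

# Rank-deficient covariance: the ordinary inverse does not exist,
# but the pseudo-inverse is still well defined.
singular_covariance = np.array([[1.0, 1.0], [1.0, 1.0]])
singular_precision = np.linalg.pinv(singular_covariance)
```

The second case shows why a pseudo-inverse is the safer choice when features are perfectly correlated.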

mahalanobis(X)

Compute the squared Mahalanobis distances of given observations.

Parameters:
X : array-like of shape (n_samples, n_features)

The observations whose squared Mahalanobis distances are computed.

Returns:
mahalanobis_distances : ndarray of shape (n_samples,)

Squared Mahalanobis distances of the observations.
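
The squared Mahalanobis distance of an observation x is (x - location)^T precision (x - location). The NumPy sketch below evaluates that formula row by row; mahalanobis_sketch and the toy location and covariance values are hypothetical and chosen only for illustration.

```python
import numpy as np


def mahalanobis_sketch(X, location, precision):
    """Hypothetical NumPy reference: squared Mahalanobis distance per row of X."""
    X_centered = np.asarray(X, dtype=np.float64) - location
    # Row-wise quadratic form (x - mu)^T P (x - mu).
    return np.sum((X_centered @ precision) * X_centered, axis=1)


location = np.array([0.0, 0.0])
covariance = np.array([[4.0, 0.0], [0.0, 1.0]])
precision = np.linalg.pinv(covariance)

X = np.array([[2.0, 0.0], [0.0, 3.0]])
distances = mahalanobis_sketch(X, location, precision)
```

The first point lies one standard deviation away along the first axis (variance 4) and the second lies three standard deviations away along the second axis (variance 1), so the squared distances are 1 and 9.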

score(X_test, y=None) -> float

Compute the log-likelihood of X_test under the estimated model.

Parameters:
X_test : array-like of shape (n_samples, n_features)

Test data whose log-likelihood is computed.

y : Ignored

Not used, present for API consistency.

Returns:
log_likelihood : float

Log-likelihood of the data under the fitted Gaussian model.
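
A NumPy sketch of a Gaussian log-likelihood computed from a fitted location and precision is shown below. It assumes the scikit-learn convention, in which the score is the average log-likelihood per test sample; whether cuML averages in the same way is an assumption here, and gaussian_score_sketch is a hypothetical helper.

```python
import numpy as np


def gaussian_score_sketch(X_test, location, precision):
    """Hypothetical NumPy reference: average Gaussian log-likelihood per sample."""
    X_test = np.asarray(X_test, dtype=np.float64)
    n_samples, n_features = X_test.shape
    X_centered = X_test - location
    # Covariance of the test data around the fitted location.
    test_cov = X_centered.T @ X_centered / n_samples
    # Log-determinant of the precision matrix, computed stably.
    _, logdet_precision = np.linalg.slogdet(precision)
    return 0.5 * (
        -np.sum(test_cov * precision)
        + logdet_precision
        - n_features * np.log(2.0 * np.pi)
    )


location = np.zeros(2)
precision = np.eye(2)
X_test = np.array([[1.0, 0.0], [0.0, 1.0]])
log_likelihood = gaussian_score_sketch(X_test, location, precision)
```

For a standard normal model, each test point at unit distance from the origin has log-density -0.5 - log(2π), and the average over the two points is the same value.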