EmpiricalCovariance

class cuml.covariance.EmpiricalCovariance(*, store_precision=True, assume_centered=False, verbose=False, output_type=None)

Maximum likelihood covariance estimator.

Parameters:

store_precision (bool, default=True): Specifies whether the estimated precision matrix is stored.
assume_centered (bool, default=False): If True, data will not be centered before computation. Useful when working with data whose mean is almost, but not exactly, zero. If False (default), data will be centered before computation.
verbose (int or boolean, default=False): Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.
output_type ({'input', 'array', 'dataframe', 'series', 'df_obj', 'numba', 'cupy', 'numpy', 'cudf', 'pandas'}, default=None): Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.
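The effect of assume_centered can be illustrated with a small NumPy sketch. This is not cuML's implementation: it runs on the CPU, it assumes the maximum likelihood estimate divides by n_samples (as in scikit-learn), and the helper name `ml_covariance` is hypothetical.

```python
import numpy as np


def ml_covariance(X, assume_centered=False):
    """Hypothetical helper: maximum likelihood covariance (divides by n_samples)."""
    X = np.asarray(X, dtype=np.float64)
    if not assume_centered:
        # Subtract the per-feature mean before forming the outer products.
        X = X - X.mean(axis=0)
    return X.T @ X / X.shape[0]


rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, size=(200, 4))  # mean deliberately far from zero

centered = ml_covariance(X)
uncentered = ml_covariance(X, assume_centered=True)

# Skipping the centering step adds the outer product of the mean to the estimate.
mu = X.mean(axis=0)
print(np.allclose(uncentered, centered + np.outer(mu, mu)))
```

When the data mean is far from zero, as here, the two estimates differ substantially, which is why assume_centered=True is only appropriate for data that is already (nearly) centered.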

Attributes:

covariance_ (ndarray of shape (n_features, n_features)): Estimated covariance matrix.
location_ (ndarray of shape (n_features,)): Estimated location, i.e., the estimated mean.
precision_ (ndarray of shape (n_features, n_features)): Estimated pseudo-inverse matrix. Only stored if store_precision is True.
n_features_in_ (int): Number of features seen during fit.
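How the fitted attributes relate to one another can be sketched in NumPy. The variable names below are illustrative, and the normalization by n_samples is an assumption based on the scikit-learn estimator this class mirrors; cuML computes the equivalent quantities on the GPU.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))

# location_: per-feature mean of the training data.
location = X.mean(axis=0)

# covariance_: maximum likelihood estimate (normalized by n_samples,
# not n_samples - 1).
X_centered = X - location
covariance = X_centered.T @ X_centered / X.shape[0]

# precision_: pseudo-inverse of the covariance matrix.
precision = np.linalg.pinv(covariance)

print(location.shape, covariance.shape, precision.shape)
```

For a well-conditioned covariance matrix the pseudo-inverse coincides with the ordinary inverse, so `covariance @ precision` is close to the identity.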

Methods

`error_norm`(comp_cov[, norm, scaling, squared])	Compute the Mean Squared Error between two covariance estimators.
`fit`(X[, y, convert_dtype])	Fit the maximum likelihood covariance estimator to X.
`get_precision`()	Getter for the precision matrix.
`mahalanobis`(X)	Compute the squared Mahalanobis distances of given observations.
`score`(X_test[, y])	Compute the log-likelihood of X_test under the estimated model.

See also

sklearn.covariance.EmpiricalCovariance: The scikit-learn CPU implementation.

Examples

>>> import cupy as cp
>>> from cuml.covariance import EmpiricalCovariance
>>> rng = cp.random.RandomState(42)
>>> X = rng.randn(100, 5)
>>> cov = EmpiricalCovariance().fit(X)
>>> cov.covariance_.shape
(5, 5)

error_norm(comp_cov, norm='frobenius', scaling=True, squared=True)

Compute the Mean Squared Error between two covariance estimators.

Parameters:

comp_cov (array-like of shape (n_features, n_features)): The covariance to compare with.
norm ({"frobenius", "spectral"}, default="frobenius"): The type of norm used to compute the error.
scaling (bool, default=True): If True, the squared error is scaled by n_features.
squared (bool, default=True): If True, return squared error. If False, return error.

Returns:

error (float): The Mean Squared Error (in the sense of the Frobenius norm) between self and comp_cov.
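The interaction of norm, scaling, and squared can be sketched as follows. This NumPy helper is hypothetical and assumes the scikit-learn definition, in which scaling divides the squared norm by n_features; it is not taken from the cuML source.

```python
import numpy as np


def error_norm_sketch(cov, comp_cov, norm="frobenius", scaling=True, squared=True):
    """Hypothetical helper following the scikit-learn definition of error_norm."""
    error = np.asarray(comp_cov, dtype=np.float64) - np.asarray(cov, dtype=np.float64)
    if norm == "frobenius":
        # Sum of squared entries of the difference matrix.
        squared_norm = np.sum(error ** 2)
    elif norm == "spectral":
        # Largest singular value of the difference matrix, squared.
        squared_norm = np.linalg.norm(error, 2) ** 2
    else:
        raise ValueError(f"unsupported norm: {norm!r}")
    if scaling:
        # Assumption: "scaled by n_features" means divided by n_features.
        squared_norm = squared_norm / error.shape[0]
    return squared_norm if squared else np.sqrt(squared_norm)


cov_a = np.eye(2)
cov_b = np.array([[2.0, 0.0], [0.0, 1.0]])

print(error_norm_sketch(cov_a, cov_b))                 # 0.5
print(error_norm_sketch(cov_a, cov_b, scaling=False))  # 1.0
```

The two matrices differ in a single entry by 1, so the squared Frobenius error is 1 and the scaled value is 1 / 2.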

fit(X, y=None, *, convert_dtype=True) → EmpiricalCovariance

Fit the maximum likelihood covariance estimator to X.

Parameters:

X (array-like of shape (n_samples, n_features)): Training data, where n_samples is the number of samples and n_features is the number of features.
y (Ignored): Not used, present for API consistency.
convert_dtype (bool, default=True): If True, convert the input data to float32.

Returns:

self (EmpiricalCovariance): Returns the instance itself.

get_precision()

Getter for the precision matrix.

Returns:

precision_ (ndarray of shape (n_features, n_features)): The precision matrix associated with the current covariance object.

mahalanobis(X)

Compute the squared Mahalanobis distances of given observations.

Parameters:

X (array-like of shape (n_samples, n_features)): The observations whose Mahalanobis distances are computed.

Returns:

mahalanobis_distances (ndarray of shape (n_samples,)): Squared Mahalanobis distances of the observations.
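The squared Mahalanobis distance of an observation x from the fitted location mu with precision matrix P is (x - mu)^T P (x - mu). A minimal NumPy sketch of that formula follows; the helper name `squared_mahalanobis` is hypothetical, and the code illustrates the quantity rather than cuML's GPU implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))

# Quantities that a fitted estimator would expose as location_ and precision_.
location = X_train.mean(axis=0)
X_c = X_train - location
covariance = X_c.T @ X_c / X_train.shape[0]
precision = np.linalg.pinv(covariance)


def squared_mahalanobis(X, location, precision):
    """Hypothetical helper: (x - mu)^T P (x - mu) for each row x of X."""
    diff = np.asarray(X, dtype=np.float64) - location
    # Row-wise quadratic form, without building an n_samples x n_samples matrix.
    return np.einsum("ij,jk,ik->i", diff, precision, diff)


# One point at the fitted mean, one point far away from it.
X_new = np.vstack([location, location + 5.0])
d2 = squared_mahalanobis(X_new, location, precision)
print(d2.shape)  # (2,)
```

The point at the fitted mean has distance zero, and the distance grows as observations move away from the mean relative to the spread of the training data.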

score(X_test, y=None) → float

Compute the log-likelihood of X_test under the estimated model.

Parameters:

X_test (array-like of shape (n_samples, n_features)): Test data whose likelihood is computed.
y (Ignored): Not used, present for API consistency.

Returns:

log_likelihood (float): Log-likelihood of the data under the fitted Gaussian model.
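The Gaussian log-likelihood behind score can be sketched in NumPy. The helper below is hypothetical and assumes the scikit-learn convention, where the returned value is the average log-likelihood per sample; whether cuML normalizes in exactly the same way should be checked against its source.

```python
import numpy as np


def gaussian_score_sketch(X_test, location, precision):
    """Hypothetical helper: average per-sample Gaussian log-likelihood."""
    X_test = np.asarray(X_test, dtype=np.float64)
    n_features = X_test.shape[1]
    diff = X_test - location
    # Covariance of the test data around the fitted location.
    test_cov = diff.T @ diff / X_test.shape[0]
    _, logdet_precision = np.linalg.slogdet(precision)
    # sum(test_cov * precision) equals trace(test_cov @ precision) for symmetric inputs.
    log_likelihood = -np.sum(test_cov * precision) + logdet_precision
    log_likelihood -= n_features * np.log(2 * np.pi)
    return log_likelihood / 2.0


rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))

# Quantities that a fitted estimator would expose as location_ and precision_.
location = X.mean(axis=0)
cov = (X - location).T @ (X - location) / X.shape[0]
precision = np.linalg.inv(cov)

print(gaussian_score_sketch(X, location, precision))
```

Higher (less negative) values indicate that the test data is better explained by the fitted mean and covariance.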