SpectralEmbedding

class cuml.manifold.SpectralEmbedding(n_components=2, affinity='nearest_neighbors', random_state=None, n_neighbors=None, verbose=False, output_type=None)

Spectral embedding for non-linear dimensionality reduction.

Forms an affinity matrix using the specified function and applies spectral decomposition to the corresponding graph Laplacian. The resulting transformation is given by the values of the eigenvectors at each data point.

Note: Laplacian Eigenmaps is the actual algorithm implemented here.
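The description above can be illustrated with a small dense NumPy sketch of Laplacian Eigenmaps. This is not cuML's implementation: the helper name `laplacian_eigenmaps`, the choice of the symmetric normalized Laplacian, and the rescaling by the inverse square root of the degree are assumptions made for illustration, and the sketch assumes every node has at least one edge.

```python
import numpy as np

def laplacian_eigenmaps(affinity, n_components=2):
    """Illustrative dense Laplacian Eigenmaps (hypothetical helper, not cuML code)."""
    W = np.asarray(affinity, dtype=np.float64)
    degree = W.sum(axis=1)  # assumed to be strictly positive for every node
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    eigvals, eigvecs = np.linalg.eigh(L)
    # Drop the first (trivial) eigenvector and keep the next n_components
    embedding = eigvecs[:, 1:n_components + 1] * d_inv_sqrt[:, None]
    return eigvals, embedding

# Affinity matrix of a ring graph with 8 nodes (a single connected component)
n = 8
W = np.zeros((n, n))
for i in range(n):
    W[i, (i + 1) % n] = 1.0
    W[(i + 1) % n, i] = 1.0

eigvals, emb = laplacian_eigenmaps(W, n_components=2)
print(emb.shape)  # (8, 2)
```

Each row of `emb` holds the coordinates of one data point, read off from the selected eigenvectors.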

Parameters:
n_components : int, default=2

The dimension of the projected subspace.

affinity : {‘nearest_neighbors’, ‘precomputed’}, default=‘nearest_neighbors’
How to construct the affinity matrix.
  • ‘nearest_neighbors’ : construct the affinity matrix by computing a graph of nearest neighbors.

  • ‘precomputed’ : interpret X as a precomputed affinity matrix.
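As a rough sketch of what a nearest-neighbors affinity matrix can look like, the NumPy code below builds a symmetric k-nearest-neighbors graph from raw points. The helper name `knn_affinity`, the exclusion of each point from its own neighbor list, and the averaging used to symmetrize the graph are assumptions for illustration; cuML's internal construction may differ.

```python
import numpy as np

def knn_affinity(X, n_neighbors):
    """Dense symmetric kNN affinity matrix (hypothetical helper, not cuML code)."""
    X = np.asarray(X, dtype=np.float64)
    n_samples = X.shape[0]
    # Pairwise squared Euclidean distances
    sq_norms = (X ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # assumption: a point is not its own neighbor
    # Indices of the n_neighbors closest points for each row
    nn_idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    A = np.zeros((n_samples, n_samples))
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    A[rows, nn_idx.ravel()] = 1.0
    # Symmetrize so the graph is undirected
    return 0.5 * (A + A.T)

rng = np.random.default_rng(0)
X = rng.random((20, 3))
A = knn_affinity(X, n_neighbors=4)
print(A.shape)  # (20, 20)
```

A matrix such as `A` is the kind of input that affinity=‘precomputed’ expects in place of the raw feature matrix.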

random_state : int, RandomState instance or None, default=None

A pseudo random number generator used for the initialization. Use an int to make the results deterministic across calls.

n_neighbors : int or None, default=None

Number of nearest neighbors for nearest_neighbors graph building. If None, n_neighbors will be set to max(n_samples/10, 1).
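The fallback rule can be sketched in plain Python. The function name `resolve_n_neighbors` is hypothetical, and truncating the quotient to an integer is an assumption; the rule above only states max(n_samples/10, 1).

```python
def resolve_n_neighbors(n_samples, n_neighbors=None):
    """Return the neighbor count that would be used (hypothetical helper)."""
    if n_neighbors is not None:
        return n_neighbors
    # Assumption: the quotient is truncated to an integer before taking the max
    return max(int(n_samples / 10), 1)

print(resolve_n_neighbors(100))                  # 10
print(resolve_n_neighbors(5))                    # 1
print(resolve_n_neighbors(100, n_neighbors=15))  # 15
```

The lower bound of 1 keeps the graph from being empty on very small datasets.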

verbose : int or boolean, default=False

Sets logging level. It must be one of cuml.common.logger.level_*. See Verbosity Levels for more info.

output_type : {‘input’, ‘array’, ‘dataframe’, ‘series’, ‘df_obj’, ‘numba’, ‘cupy’, ‘numpy’, ‘cudf’, ‘pandas’}, default=None

Return results and set estimator attributes to the indicated output type. If None, the output type set at the module level (cuml.global_settings.output_type) will be used. See Output Data Type Configuration for more info.

Attributes:
embedding_ : cupy.ndarray of shape (n_samples, n_components)

Spectral embedding of the training matrix.

n_neighbors_ : int

Number of nearest neighbors effectively used.

Methods

fit(self, X[, y])

Fit the model from data in X.

fit_transform(self, X[, y])

Fit the model from data in X and transform X.

Notes

Spectral Embedding (Laplacian Eigenmaps) is most useful when the graph has one connected component. If the graph has many components, the first few eigenvectors will simply uncover the connected components of the graph.
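A small NumPy sketch shows why this happens. It uses the unnormalized Laplacian L = D - W on a toy graph made of two disjoint triangles; the graph and the 1e-9 tolerance are illustrative choices, not part of cuML.

```python
import numpy as np

# Two disjoint triangles: nodes {0, 1, 2} and nodes {3, 4, 5}
W = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[a, b] = W[b, a] = 1.0

# Unnormalized graph Laplacian: degree matrix minus affinity matrix
L = np.diag(W.sum(axis=1)) - W
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order

# One (numerically) zero eigenvalue per connected component
n_zero = int(np.sum(np.abs(eigvals) < 1e-9))
print(n_zero)  # 2

# The matching eigenvectors are constant within each component,
# so they encode component membership rather than geometry
null_vecs = eigvecs[:, :n_zero]
print(np.allclose(null_vecs[:3], null_vecs[0]))  # True
print(np.allclose(null_vecs[3:], null_vecs[3]))  # True
```

With a single connected component only one eigenvalue is zero, and the following eigenvectors carry the geometric structure of the data.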

Examples

>>> import cupy as cp
>>> from cuml.manifold import SpectralEmbedding
>>> X = cp.random.rand(100, 20, dtype=cp.float32)
>>> embedding = SpectralEmbedding(n_components=2, random_state=42)
>>> X_transformed = embedding.fit_transform(X)
>>> X_transformed.shape
(100, 2)
fit(self, X, y=None) -> 'SpectralEmbedding'

Fit the model from data in X.

Parameters:
X : array-like or sparse matrix of shape (n_samples, n_features) or (n_samples, n_samples)

Training vector, where n_samples is the number of samples and n_features is the number of features. If affinity is ‘precomputed’, X is the affinity matrix. Supported formats for precomputed affinity: scipy sparse (CSR, CSC, COO), cupy sparse (CSR, CSC, COO), dense numpy arrays, or dense cupy arrays.

y : Ignored

Not used, present for API consistency by convention.

Returns:
self : object

Returns the instance itself.

fit_transform(self, X, y=None) -> CumlArray

Fit the model from data in X and transform X.

Parameters:
X : array-like or sparse matrix of shape (n_samples, n_features) or (n_samples, n_samples)

Training vector, where n_samples is the number of samples and n_features is the number of features. If affinity is ‘precomputed’, X is the affinity matrix. Supported formats for precomputed affinity: scipy sparse (CSR, CSC, COO), cupy sparse (CSR, CSC, COO), dense numpy arrays, or dense cupy arrays.

y : Ignored

Not used, present for API consistency by convention.

Returns:
X_new : cupy.ndarray of shape (n_samples, n_components)

Spectral embedding of the training matrix.