spectral_clustering

cuml.cluster.spectral_clustering(X, *, int n_clusters=8, random_state=None, n_components=None, n_neighbors=10, n_init=10, eigen_tol='auto', affinity='nearest_neighbors')

Apply clustering to a projection of the normalized Laplacian.

In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of a cluster is not a suitable description of the complete cluster, for instance when the clusters are nested circles in the 2D plane.

If X is the adjacency matrix of a graph (passed with affinity=‘precomputed’), this method can be used to find normalized graph cuts.
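As a rough illustration of the idea described above, the following NumPy-only sketch separates two concentric rings by splitting on the sign of the second eigenvector of the normalized Laplacian. It is not cuML's implementation: it assumes a dense Gaussian (RBF) affinity with a hand-picked `gamma` instead of a k-NN graph, handles only two clusters, and replaces k-means with a sign split. The helper names `two_rings`, `rbf_affinity`, and `fiedler_split` are hypothetical.

```python
import numpy as np


def two_rings(n_per_ring=100, radii=(1.0, 3.0)):
    """Noise-free points on concentric circles (illustrative data only)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_per_ring, endpoint=False)
    circle = np.column_stack([np.cos(theta), np.sin(theta)])
    X = np.vstack([r * circle for r in radii])
    y = np.repeat(np.arange(len(radii)), n_per_ring)
    return X, y


def rbf_affinity(X, gamma=5.0):
    """Dense Gaussian affinity; an assumption, not the k-NN graph cuML builds."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    A = np.exp(-gamma * sq_dists)
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A


def fiedler_split(A):
    """Two-way split from the second eigenvector of the normalized Laplacian."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # L_sym = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    fiedler = eigvecs[:, 1]
    return (fiedler > 0).astype(np.int64)


X, y_true = two_rings()
labels = fiedler_split(rbf_affinity(X))

# Cluster ids are arbitrary, so compare up to a swap of the two labels.
agreement = max(np.mean(labels == y_true), np.mean(labels != y_true))
```

Both rings share the same centroid (the origin), so a description based on center and spread cannot tell them apart, whereas the graph-based split can.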

Parameters:
X : array-like or sparse matrix of shape (n_samples, n_features) or (n_samples, n_samples)

If affinity is ‘nearest_neighbors’, this is the input data and a k-NN graph will be constructed. If affinity is ‘precomputed’, this is the affinity matrix. Supported formats for precomputed affinity: scipy sparse (CSR, CSC, COO), cupy sparse (CSR, CSC, COO), dense numpy arrays, or dense cupy arrays.

n_clusters : int, default=8

The number of clusters to form.

random_state : int, RandomState instance or None, default=None

A pseudo-random number generator used to initialize the k-means clustering and the eigendecomposition. Pass an int to make the results deterministic across calls.

n_components : int or None, default=None

Number of eigenvectors to use for the spectral embedding. If None, defaults to n_clusters.

n_neighbors : int, default=10

Number of nearest neighbors used to build the nearest-neighbors graph. Only used when affinity=‘nearest_neighbors’.

n_init : int, default=10

Number of times the k-means algorithm is run with different centroid seeds. The final result is the best output of the n_init consecutive runs in terms of inertia.

eigen_tol : float or ‘auto’, default=‘auto’

Convergence tolerance passed to the eigensolver. If set to ‘auto’, the tolerance currently defaults to 0.0.

affinity : {‘nearest_neighbors’, ‘precomputed’}, default=‘nearest_neighbors’

How to construct the affinity matrix.
  • ‘nearest_neighbors’ : construct the affinity matrix by computing a graph of nearest neighbors.

  • ‘precomputed’ : interpret X as a precomputed affinity matrix.
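For the ‘precomputed’ option, the affinity matrix has to be built by the caller. The sketch below shows one possible way to build a symmetric 0/1 k-NN affinity in SciPy CSR format, one of the formats listed for X. The helper `knn_affinity_csr` is hypothetical, the brute-force distance computation only suits small inputs, and the symmetrization rule is an assumption that may differ from the graph cuML builds internally for ‘nearest_neighbors’.

```python
import numpy as np
import scipy.sparse as sp


def knn_affinity_csr(X, n_neighbors=10):
    """Symmetric 0/1 k-NN affinity matrix in CSR format (brute force)."""
    n_samples = X.shape[0]
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(sq_dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(sq_dists, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    data = np.ones(rows.shape[0], dtype=np.float32)
    graph = sp.csr_matrix(
        (data, (rows, neighbors.ravel())), shape=(n_samples, n_samples)
    )
    # Keep an edge if either endpoint lists the other as a neighbor.
    sym = (graph + graph.T).tocsr()
    sym.data[:] = 1.0
    return sym


rng = np.random.default_rng(0)
X = rng.random((100, 10), dtype=np.float32)
affinity_matrix = knn_affinity_csr(X, n_neighbors=10)

# With cuML installed and a GPU available, the matrix would then be passed as:
#   from cuml.cluster import spectral_clustering
#   labels = spectral_clustering(affinity_matrix, n_clusters=5,
#                                affinity="precomputed", random_state=42)
```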

Returns:
labels : cupy.ndarray or np.ndarray of shape (n_samples,)

Cluster labels for each sample.

Notes

The graph should contain only one connected component, otherwise the results make little sense.
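One way to check this condition before clustering is to count the connected components of the affinity graph with SciPy. This is a minimal sketch on the host side, assuming a symmetric affinity matrix; `count_components` and `undirected_graph` are hypothetical helpers, not part of cuML.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import connected_components


def count_components(affinity_matrix):
    """Number of connected components of an undirected affinity graph."""
    n_components, _ = connected_components(
        sp.csr_matrix(affinity_matrix), directed=False
    )
    return n_components


def undirected_graph(edges, n_nodes):
    """Symmetric 0/1 adjacency matrix (CSR) from a list of (i, j) edges."""
    rows, cols = np.array(edges).T
    upper = sp.coo_matrix(
        (np.ones(len(edges)), (rows, cols)), shape=(n_nodes, n_nodes)
    )
    return (upper + upper.T).tocsr()


# A chain 0-1-2 plus a separate pair 3-4: two components.
disconnected = undirected_graph([(0, 1), (1, 2), (3, 4)], n_nodes=5)
# Adding the edge 2-3 joins everything into a single component.
connected = undirected_graph([(0, 1), (1, 2), (2, 3), (3, 4)], n_nodes=5)

n_disconnected = count_components(disconnected)
n_connected = count_components(connected)
```

If the count is greater than one, increasing n_neighbors or clustering each component separately are possible remedies.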

This algorithm solves the normalized cut for k=2: it is a normalized spectral clustering.
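For reference, the normalized cut of a two-way partition (S, T) is cut(S, T) / vol(S) + cut(S, T) / vol(T), where cut(S, T) sums the affinities between the two sides and vol is the total degree of one side. The sketch below evaluates this objective for a given labeling; it only computes the score and does not perform the spectral relaxation itself. The function name `normalized_cut` and the toy graph are illustrative.

```python
import numpy as np


def normalized_cut(A, in_first_group):
    """Normalized-cut value of a two-way partition of a symmetric affinity A."""
    mask = np.asarray(in_first_group, dtype=bool)
    cut = A[mask][:, ~mask].sum()  # affinity crossing the partition
    vol_first = A[mask].sum()      # total degree of the first group
    vol_second = A[~mask].sum()    # total degree of the second group
    return cut / vol_first + cut / vol_second


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Splitting between the triangles cuts one edge; each side has volume 7.
balanced = normalized_cut(A, [True, True, True, False, False, False])
# Isolating node 0 cuts two edges and leaves a side with volume 2.
lopsided = normalized_cut(A, [True, False, False, False, False, False])
```

The split between the two triangles scores lower than isolating a single node, which is the kind of balanced partition the normalized cut favors.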

Examples

>>> import numpy as np
>>> from cuml.cluster import spectral_clustering
>>> X = np.random.rand(100, 10).astype(np.float32)
>>> labels = spectral_clustering(X, n_clusters=5, n_neighbors=10, random_state=42)