spectral_clustering
- cuml.cluster.spectral_clustering(X, *, int n_clusters=8, random_state=None, n_components=None, n_neighbors=10, n_init=10, eigen_tol='auto', affinity='nearest_neighbors')
Apply clustering to a projection of the normalized Laplacian.
In practice, spectral clustering is very useful when the structure of the individual clusters is highly non-convex or, more generally, when a measure of the center and spread of a cluster is not a suitable description of the complete cluster, for instance when the clusters are nested circles on the 2D plane.
If affinity is the adjacency matrix of a graph, this method can be used to find normalized graph cuts.
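To make "a projection of the normalized Laplacian" concrete, the following NumPy sketch walks through the idea on a tiny graph. It is a conceptual illustration only, not cuML's GPU implementation: the toy affinity matrix `W` is made up for the example, and the final step uses a simple sign split on the second eigenvector instead of k-means.

```python
import numpy as np

# Toy affinity matrix (an assumption for illustration): two triangles
# joined by a single weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                W[i, j] = 1.0
W[2, 3] = W[3, 2] = 0.1

# Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}
deg = W.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(deg)
L_sym = np.eye(6) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

# eigh returns eigenvalues in ascending order; the eigenvectors with the
# smallest eigenvalues form the spectral embedding.
eigvals, eigvecs = np.linalg.eigh(L_sym)

# For two clusters, the sign of the (degree-rescaled) second eigenvector
# gives an approximate normalized cut.
fiedler = eigvecs[:, 1] * d_inv_sqrt
labels = (fiedler > 0).astype(int)
```

The two triangles end up with different labels; which triangle receives label 0 depends on the arbitrary sign of the eigenvector.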
- Parameters:
- X : array-like or sparse matrix of shape (n_samples, n_features) or (n_samples, n_samples)
If affinity is ‘nearest_neighbors’, this is the input data and a k-NN graph will be constructed. If affinity is ‘precomputed’, this is the affinity matrix. Supported formats for precomputed affinity: scipy sparse (CSR, CSC, COO), cupy sparse (CSR, CSC, COO), dense numpy arrays, or dense cupy arrays.
- n_clusters : int, default=8
The number of clusters to form.
- random_state : int, RandomState instance or None, default=None
Pseudo-random number generator used to initialize the k-means clustering and the eigendecomposition. Pass an int to make the results deterministic across calls.
- n_components : int or None, default=None
Number of eigenvectors to use for the spectral embedding. If None, defaults to n_clusters.
- n_neighbors : int, default=10
Number of nearest neighbors used to build the k-NN graph. Only used when affinity=‘nearest_neighbors’.
- n_init : int, default=10
Number of times the k-means algorithm is run with different centroid seeds. The final result is the best output of the n_init consecutive runs in terms of inertia.
- eigen_tol : float or ‘auto’, default=‘auto’
Convergence tolerance passed to the eigensolver. If set to ‘auto’, the current default value of 0.0 is used.
- affinity : {‘nearest_neighbors’, ‘precomputed’}, default=‘nearest_neighbors’
How to construct the affinity matrix.
‘nearest_neighbors’ : construct the affinity matrix by computing a graph of nearest neighbors.
‘precomputed’ : interpret X as a precomputed affinity matrix.
- Returns:
- labels : cupy.ndarray or np.ndarray of shape (n_samples,)
Cluster labels for each sample.
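As a rough sketch of the ‘precomputed’ path, the snippet below builds a dense affinity matrix on the CPU and shows how it could be passed in. The helpers `knn_affinity` and `cluster_precomputed` are hypothetical names introduced for this example, and the symmetrized binary k-NN graph is only one possible affinity; it is not necessarily what cuML constructs internally for ‘nearest_neighbors’. `cluster_precomputed` needs cuML and a GPU, so it is defined here but not called.

```python
import numpy as np


def knn_affinity(X, n_neighbors):
    """Hypothetical helper: dense, symmetric, binary k-NN affinity matrix."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances (fine for small n only).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbors
    idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    A = np.zeros((n, n), dtype=np.float32)
    A[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    # Symmetrize: keep an edge if either endpoint lists the other.
    return np.maximum(A, A.T)


def cluster_precomputed(A, n_clusters):
    """Hypothetical wrapper; requires cuML and a GPU."""
    from cuml.cluster import spectral_clustering

    return spectral_clustering(
        A, n_clusters=n_clusters, affinity="precomputed", random_state=0
    )


rng = np.random.default_rng(0)
X = rng.random((100, 10), dtype=np.float32)
A = knn_affinity(X, n_neighbors=10)
# On a machine with cuML installed: labels = cluster_precomputed(A, n_clusters=5)
```

A precomputed affinity matrix has shape (n_samples, n_samples), so the dense form shown here is practical only for small inputs; the sparse formats listed for X are the better fit for larger graphs.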
Notes
The graph should contain only one connected component, otherwise the results make little sense.
This algorithm solves the normalized cut for k=2: it is a normalized spectral clustering.
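Because the notes ask for a graph with a single connected component, it can be worth checking a precomputed affinity matrix before clustering. The following is a minimal sketch using a plain depth-first traversal in NumPy; `count_components` is a hypothetical helper written for this example and is not part of cuML.

```python
import numpy as np


def count_components(A):
    """Count connected components of a dense affinity matrix, treated as undirected."""
    n = A.shape[0]
    nonzero = A != 0
    adj = nonzero | nonzero.T  # ignore edge direction
    seen = np.zeros(n, dtype=bool)
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            node = stack.pop()
            # Unvisited neighbors of the current node.
            nbrs = np.flatnonzero(adj[node] & ~seen)
            seen[nbrs] = True
            stack.extend(nbrs.tolist())
    return count


# Two disjoint triangles form two components.
A = np.zeros((6, 6), dtype=np.float32)
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
n_split = count_components(A)

# Adding one bridging edge makes the graph connected.
A[2, 3] = A[3, 2] = 1.0
n_joined = count_components(A)
```

If the count is greater than one, increasing n_neighbors or clustering each component separately are two possible remedies.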
Examples
>>> import numpy as np
>>> from cuml.cluster import spectral_clustering
>>> X = np.random.rand(100, 10).astype(np.float32)
>>> labels = spectral_clustering(X, n_clusters=5, n_neighbors=10, random_state=42)