homogeneity_score#

cuml.metrics.cluster.homogeneity_score(labels_true, labels_pred) float[source]#

cython_homogeneity_score(labels_true, labels_pred) -> float

Computes the homogeneity metric of a cluster labeling given a ground truth.

A clustering result satisfies homogeneity if all of its clusters contain only data points which are members of a single class.

This metric is independent of the absolute values of the labels: a permutation of the class or cluster label values won’t change the score value in any way.

This metric is not symmetric: switching label_true with label_pred will return the completeness_score which will be different in general.

The labels in labels_pred and labels_true are assumed to be drawn from a contiguous set (Ex: drawn from {2, 3, 4}, but not from {2, 4}). If your set of labels looks like {2, 4}, convert them to something like {0, 1}.

Parameters:
labels_predarray-like (device or host) shape = (n_samples,)

The labels predicted by the model for the test dataset. Acceptable formats: cuDF DataFrame, NumPy ndarray, Numba device ndarray, cuda array interface compliant array like CuPy

labels_truearray-like (device or host) shape = (n_samples,)

The ground truth labels (ints) of the test dataset. Acceptable formats: cuDF DataFrame, NumPy ndarray, Numba device ndarray, cuda array interface compliant array like CuPy

Returns:
float

The homogeneity of the predicted labeling given the ground truth. Score between 0.0 and 1.0. 1.0 stands for perfectly homogeneous labeling.