approximate_predict
- cuml.cluster.hdbscan.approximate_predict(clusterer, points_to_predict, convert_dtype=True)
Predict the cluster labels of new points. The returned labels are those of the original clustering found by clusterer, and are therefore not necessarily the labels that would be found by clustering the original data combined with points_to_predict, hence the ‘approximate’ label.
If you simply wish to assign new points to an existing clustering in the ‘best’ way possible, this is the function to use. If you want to predict how points_to_predict would cluster with the original data under HDBSCAN, the most efficient existing approach is to recluster with the new point(s) added to the original dataset.
- Parameters:
- clusterer : HDBSCAN
A clustering object that has been fit to the data with prediction_data=True set.
- points_to_predict : array, or array-like (n_samples, n_features)
The new data points to predict cluster labels for. They should have the same dimensionality as the original dataset over which clusterer was fit.
- Returns:
- labels : array (n_samples,)
The predicted labels of the points_to_predict.
- probabilities : array (n_samples,)
The soft cluster scores for each of the points_to_predict.
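A minimal usage sketch, not taken from the original page, may help show how the two return values are typically consumed. It assumes that noise points receive the label -1 (the usual HDBSCAN convention) and that both outputs expose a .tolist() method, as NumPy and CuPy arrays do. The helper group_by_cluster, its min_probability cutoff, the function demo, and the synthetic data are all hypothetical names and values introduced only for illustration.

```python
from collections import defaultdict


def group_by_cluster(labels, probabilities, min_probability=0.0):
    """Map each predicted cluster label to the indices of the new points assigned to it.

    Points labelled -1 (assumed to mean noise), or whose score falls below the
    hypothetical `min_probability` cutoff, are collected under the key -1.
    """
    groups = defaultdict(list)
    for index, (label, score) in enumerate(zip(labels, probabilities)):
        label = int(label)
        if label == -1 or float(score) < min_probability:
            groups[-1].append(index)
        else:
            groups[label].append(index)
    return dict(groups)


def demo():
    """Fit on two synthetic blobs, then label three new points (needs cuML and a GPU)."""
    import numpy as np
    from cuml.cluster import HDBSCAN
    from cuml.cluster.hdbscan import approximate_predict

    rng = np.random.default_rng(0)
    train = np.vstack([
        rng.normal(loc=0.0, scale=0.3, size=(100, 2)),
        rng.normal(loc=5.0, scale=0.3, size=(100, 2)),
    ]).astype(np.float32)

    # prediction_data=True must be set before fitting, as the clusterer
    # parameter description above requires.
    clusterer = HDBSCAN(min_cluster_size=10, prediction_data=True)
    clusterer.fit(train)

    # New points must have the same number of features as the training data.
    new_points = np.array(
        [[0.1, -0.1], [5.2, 4.9], [50.0, 50.0]], dtype=np.float32
    )
    labels, probabilities = approximate_predict(clusterer, new_points)

    # Assumption: both outputs expose .tolist() (true for NumPy and CuPy arrays).
    return group_by_cluster(
        labels.tolist(), probabilities.tolist(), min_probability=0.5
    )
```

Calling demo() on a machine with cuML installed should return a dictionary keyed by cluster label. The exact labels depend on the fitted clustering, so no expected output is shown here. The 0.5 cutoff is an arbitrary choice for the sketch; whether to threshold the soft scores at all depends on how costly a wrong assignment is in your application.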