Stats
This page provides the C++ API reference for the publicly exposed elements of the cuvs/stats
package.
Silhouette Score
#include <cuvs/stats/silhouette_score.hpp>
namespace cuvs::stats
-
float silhouette_score(raft::resources const &handle, raft::device_matrix_view<const float, int64_t, raft::row_major> X_in, raft::device_vector_view<const int, int64_t> labels, std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample, int64_t n_unique_labels, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded)
Computes the average silhouette score for a dataset and its cluster labels (single precision).
- Parameters:
handle – [in] raft handle for managing expensive resources
X_in – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector holding the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
metric – [in] distance metric to use (default: cuvs::distance::DistanceType::L2Unexpanded)
- Returns:
The silhouette score.
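To make the returned quantity concrete, the following is a minimal CPU sketch of the standard silhouette definition: for each sample, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the per-sample score is (b - a) / max(a, b). It is not the cuVS implementation. The helper names `row_distance` and `silhouette_score_cpu` are hypothetical, plain Euclidean distance is assumed, labels are assumed to lie in [0, n_unique_labels), and samples in singleton clusters are given a score of 0 by convention.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Euclidean distance between rows i and j of a row-major matrix with n_cols columns.
static double row_distance(const std::vector<double>& X, std::size_t n_cols,
                           std::size_t i, std::size_t j)
{
  double acc = 0.0;
  for (std::size_t k = 0; k < n_cols; ++k) {
    const double d = X[i * n_cols + k] - X[j * n_cols + k];
    acc += d * d;
  }
  return std::sqrt(acc);
}

// Hypothetical CPU reference helper (not part of cuVS).
// Returns the mean silhouette score; optionally writes the per-sample scores.
double silhouette_score_cpu(const std::vector<double>& X, std::size_t n_rows,
                            std::size_t n_cols, const std::vector<int>& labels,
                            std::size_t n_unique_labels,
                            std::vector<double>* per_sample = nullptr)
{
  if (per_sample != nullptr) { per_sample->assign(n_rows, 0.0); }
  if (n_rows == 0) { return 0.0; }

  // Number of samples in each cluster.
  std::vector<std::size_t> cluster_size(n_unique_labels, 0);
  for (std::size_t i = 0; i < n_rows; ++i) {
    ++cluster_size[static_cast<std::size_t>(labels[i])];
  }

  std::vector<double> dist_sum(n_unique_labels);
  double total = 0.0;
  for (std::size_t i = 0; i < n_rows; ++i) {
    // Sum of distances from sample i to the members of every cluster.
    std::fill(dist_sum.begin(), dist_sum.end(), 0.0);
    for (std::size_t j = 0; j < n_rows; ++j) {
      if (j != i) {
        dist_sum[static_cast<std::size_t>(labels[j])] += row_distance(X, n_cols, i, j);
      }
    }

    const std::size_t own = static_cast<std::size_t>(labels[i]);
    double s = 0.0;  // singleton clusters keep a score of 0
    if (cluster_size[own] > 1) {
      // a: mean distance to the other members of the sample's own cluster.
      const double a = dist_sum[own] / static_cast<double>(cluster_size[own] - 1);
      // b: smallest mean distance to the members of any other non-empty cluster.
      double b = std::numeric_limits<double>::infinity();
      for (std::size_t c = 0; c < n_unique_labels; ++c) {
        if (c != own && cluster_size[c] > 0) {
          b = std::min(b, dist_sum[c] / static_cast<double>(cluster_size[c]));
        }
      }
      const double denom = std::max(a, b);
      if (std::isfinite(b) && denom > 0.0) { s = (b - a) / denom; }
    }
    if (per_sample != nullptr) { (*per_sample)[i] = s; }
    total += s;
  }
  return total / static_cast<double>(n_rows);
}
```

For the one-dimensional points 0, 1, 10, 11 with labels 0, 0, 1, 1, this sketch gives per-sample scores of 9.5/10.5 for the two outer points and 8.5/9.5 for the two inner ones, so the mean is roughly 0.90. A reference like this can be handy for checking results on small inputs, keeping in mind that the metric passed to the GPU call has to match the distance used here.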
-
float silhouette_score_batched(raft::resources const &handle, raft::device_matrix_view<const float, int64_t, raft::row_major> X, raft::device_vector_view<const int, int64_t> labels, std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample, int64_t n_unique_labels, int64_t batch_size, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded)
Computes the average silhouette score for a dataset and its cluster labels, processing the samples in batches of batch_size (single precision).
- Parameters:
handle – [in] raft handle for managing expensive resources
X – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector holding the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
batch_size – [in] number of samples per batch
metric – [in] distance metric to use (default: cuvs::distance::DistanceType::L2Unexpanded)
- Returns:
The silhouette score.
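This page does not describe how the batched variant organizes its work internally. As an illustration of why batching helps, the CPU sketch below computes the same score one tile of pairwise distances at a time, so at most batch_size x batch_size distances are held at once instead of the full nRows x nRows matrix. The helper name `silhouette_score_tiled_cpu` is hypothetical, plain Euclidean distance is assumed, and labels are assumed to lie in [0, n_unique_labels).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU helper (not part of cuVS): mean silhouette score computed tile by tile.
double silhouette_score_tiled_cpu(const std::vector<double>& X, std::size_t n_rows,
                                  std::size_t n_cols, const std::vector<int>& labels,
                                  std::size_t n_unique_labels, std::size_t batch_size)
{
  if (n_rows == 0 || batch_size == 0) { return 0.0; }

  std::vector<std::size_t> cluster_size(n_unique_labels, 0);
  for (std::size_t i = 0; i < n_rows; ++i) {
    ++cluster_size[static_cast<std::size_t>(labels[i])];
  }

  // dist_sum[i * n_unique_labels + c]: sum of distances from sample i to cluster c.
  std::vector<double> dist_sum(n_rows * n_unique_labels, 0.0);
  // Scratch space for one tile of pairwise distances.
  std::vector<double> tile(batch_size * batch_size);

  for (std::size_t r0 = 0; r0 < n_rows; r0 += batch_size) {
    const std::size_t r1 = std::min(r0 + batch_size, n_rows);
    for (std::size_t c0 = 0; c0 < n_rows; c0 += batch_size) {
      const std::size_t c1 = std::min(c0 + batch_size, n_rows);

      // Step 1: pairwise distances between rows [r0, r1) and rows [c0, c1).
      for (std::size_t i = r0; i < r1; ++i) {
        for (std::size_t j = c0; j < c1; ++j) {
          double acc = 0.0;
          for (std::size_t k = 0; k < n_cols; ++k) {
            const double d = X[i * n_cols + k] - X[j * n_cols + k];
            acc += d * d;
          }
          tile[(i - r0) * batch_size + (j - c0)] = std::sqrt(acc);
        }
      }

      // Step 2: fold the tile into the per-cluster sums.
      // The i == j entries are zero, so they do not need to be excluded.
      for (std::size_t i = r0; i < r1; ++i) {
        for (std::size_t j = c0; j < c1; ++j) {
          const std::size_t c = static_cast<std::size_t>(labels[j]);
          dist_sum[i * n_unique_labels + c] += tile[(i - r0) * batch_size + (j - c0)];
        }
      }
    }
  }

  // Step 3: turn the per-cluster sums into silhouette values.
  double total = 0.0;
  for (std::size_t i = 0; i < n_rows; ++i) {
    const std::size_t own = static_cast<std::size_t>(labels[i]);
    if (cluster_size[own] <= 1) { continue; }  // singleton clusters contribute 0
    const double a =
      dist_sum[i * n_unique_labels + own] / static_cast<double>(cluster_size[own] - 1);
    double b = std::numeric_limits<double>::infinity();
    for (std::size_t c = 0; c < n_unique_labels; ++c) {
      if (c != own && cluster_size[c] > 0) {
        b = std::min(b, dist_sum[i * n_unique_labels + c] / static_cast<double>(cluster_size[c]));
      }
    }
    const double denom = std::max(a, b);
    if (std::isfinite(b) && denom > 0.0) { total += (b - a) / denom; }
  }
  return total / static_cast<double>(n_rows);
}
```

In this sketch the tile size changes only the amount of scratch memory and the order in which the sums are accumulated. Results for different values of batch_size agree up to floating-point rounding.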
-
double silhouette_score(raft::resources const &handle, raft::device_matrix_view<const double, int64_t, raft::row_major> X_in, raft::device_vector_view<const int, int64_t> labels, std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample, int64_t n_unique_labels, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded)
Computes the average silhouette score for a dataset and its cluster labels (double precision).
- Parameters:
handle – [in] raft handle for managing expensive resources
X_in – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector holding the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
metric – [in] distance metric to use (default: cuvs::distance::DistanceType::L2Unexpanded)
- Returns:
The silhouette score.
-
double silhouette_score_batched(raft::resources const &handle, raft::device_matrix_view<const double, int64_t, raft::row_major> X, raft::device_vector_view<const int, int64_t> labels, std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample, int64_t n_unique_labels, int64_t batch_size, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded)
Computes the average silhouette score for a dataset and its cluster labels, processing the samples in batches of batch_size (double precision).
- Parameters:
handle – [in] raft handle for managing expensive resources
X – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector holding the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
batch_size – [in] number of samples per batch
metric – [in] distance metric to use (default: cuvs::distance::DistanceType::L2Unexpanded)
- Returns:
The silhouette score.
Trustworthiness Score
#include <cuvs/stats/trustworthiness_score.hpp>
namespace cuvs::stats
-
double trustworthiness_score(raft::resources const &handle, raft::device_matrix_view<const float, int64_t, raft::row_major> X, raft::device_matrix_view<const float, int64_t, raft::row_major> X_embedded, int n_neighbors, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2SqrtUnexpanded, int batch_size = 512)
Compute the trustworthiness score.
Note
The constness of the data in X_embedded is currently cast away, and the data is slightly modified.
- Parameters:
handle – [in] the raft handle
X – [in] Data in original dimension
X_embedded – [in] Data in target dimension (embedding)
n_neighbors – [in] Number of neighbors considered by trustworthiness score
metric – [in] distance metric to use (default: cuvs::distance::DistanceType::L2SqrtUnexpanded)
batch_size – [in] batch size (default: 512)
- Returns:
Trustworthiness score
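As a rough guide to what the score measures, the sketch below implements the commonly used definition of trustworthiness on the CPU: for every sample, each of its k nearest neighbors in the embedding is penalized by how far beyond rank k it sits in the original space, and the total penalty is normalized so that a perfectly preserved neighborhood structure scores 1. Whether cuVS follows this formula term for term is an assumption. The helper names are hypothetical, neighbors are ordered by Euclidean distance, and the normalization assumes 0 < k and 3k + 1 < 2n.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Squared Euclidean distance between rows i and j of a row-major matrix.
// Squaring does not change the neighbor ordering.
static double squared_distance(const std::vector<double>& X, std::size_t dim,
                               std::size_t i, std::size_t j)
{
  double acc = 0.0;
  for (std::size_t k = 0; k < dim; ++k) {
    const double d = X[i * dim + k] - X[j * dim + k];
    acc += d * d;
  }
  return acc;
}

// For every sample i, list all other samples ordered from nearest to farthest.
static std::vector<std::vector<std::size_t>> neighbor_order(const std::vector<double>& X,
                                                            std::size_t n, std::size_t dim)
{
  std::vector<std::vector<std::size_t>> order(n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      if (j != i) { order[i].push_back(j); }
    }
    std::stable_sort(order[i].begin(), order[i].end(), [&](std::size_t a, std::size_t b) {
      return squared_distance(X, dim, i, a) < squared_distance(X, dim, i, b);
    });
  }
  return order;
}

// Hypothetical CPU reference helper (not part of cuVS).
// X is n x dim, X_embedded is n x dim_embedded, k is the number of neighbors.
double trustworthiness_cpu(const std::vector<double>& X, const std::vector<double>& X_embedded,
                           std::size_t n, std::size_t dim, std::size_t dim_embedded,
                           std::size_t k)
{
  if (k == 0 || 2 * n <= 3 * k + 1) {
    throw std::invalid_argument("requires 0 < k and 3k + 1 < 2n");
  }
  const auto orig = neighbor_order(X, n, dim);
  const auto emb  = neighbor_order(X_embedded, n, dim_embedded);

  double penalty = 0.0;
  std::vector<std::size_t> rank(n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    // rank[j]: 1-based rank of sample j among the neighbors of i in the original space.
    for (std::size_t r = 0; r < orig[i].size(); ++r) { rank[orig[i][r]] = r + 1; }
    // Penalize embedded neighbors that are not among the k nearest in the original space.
    for (std::size_t t = 0; t < k; ++t) {
      const std::size_t j = emb[i][t];
      if (rank[j] > k) { penalty += static_cast<double>(rank[j] - k); }
    }
  }
  const double norm = 2.0 / (static_cast<double>(n) * static_cast<double>(k) *
                             static_cast<double>(2 * n - 3 * k - 1));
  return 1.0 - norm * penalty;
}
```

Passing the same data as both X and X_embedded yields exactly 1, since no embedded neighbor falls outside the original neighborhood. An embedding that moves distant points close together yields a lower value.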