Stats
This page provides the C++ API reference for the publicly exposed elements of the cuvs/stats
package.
Silhouette Score
#include <cuvs/stats/silhouette_score.hpp>
namespace cuvs::stats
- float silhouette_score(
    raft::resources const &handle,
    raft::device_matrix_view<const float, int64_t, raft::row_major> X_in,
    raft::device_vector_view<const int, int64_t> labels,
    std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample,
    int64_t n_unique_labels,
    cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded
  )
Computes the average silhouette score for a dataset and its cluster assignments.
- Parameters:
handle – [in] raft handle for managing expensive resources
X_in – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector containing the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
metric – [in] distance metric to use; defaults to L2Unexpanded
- Returns:
The average silhouette score over all samples.
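To make the quantity returned above concrete, the following is a minimal CPU sketch of the standard silhouette definition: for each sample, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the per-sample score is (b - a) / max(a, b). The function name silhouette_score_reference is hypothetical and is not part of cuVS. The sketch assumes plain Euclidean distance, labels in the range [0, n_unique_labels), and a score of 0 for singleton clusters; the cuVS implementation and its default metric may differ in these details.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical CPU reference, NOT part of cuVS.
// X is row-major (n_rows x n_cols); labels[i] is in [0, n_unique_labels).
// Assumption: plain Euclidean distance; singleton clusters score 0.
double silhouette_score_reference(const std::vector<double>& X,
                                  const std::vector<int>& labels,
                                  std::size_t n_cols,
                                  std::size_t n_unique_labels,
                                  std::vector<double>* per_sample = nullptr)
{
  const std::size_t n_rows = labels.size();
  assert(n_rows > 0 && n_cols > 0 && n_unique_labels > 0);
  assert(X.size() == n_rows * n_cols);

  // Count the members of each cluster.
  std::vector<std::size_t> cluster_size(n_unique_labels, 0);
  for (int l : labels) {
    assert(l >= 0 && static_cast<std::size_t>(l) < n_unique_labels);
    ++cluster_size[static_cast<std::size_t>(l)];
  }

  if (per_sample != nullptr) per_sample->assign(n_rows, 0.0);
  std::vector<double> dist_sum(n_unique_labels);
  double total = 0.0;

  for (std::size_t i = 0; i < n_rows; ++i) {
    // Sum of distances from sample i to the members of every cluster.
    std::fill(dist_sum.begin(), dist_sum.end(), 0.0);
    for (std::size_t j = 0; j < n_rows; ++j) {
      double sq = 0.0;
      for (std::size_t k = 0; k < n_cols; ++k) {
        const double d = X[i * n_cols + k] - X[j * n_cols + k];
        sq += d * d;
      }
      dist_sum[static_cast<std::size_t>(labels[j])] += std::sqrt(sq);
    }

    const std::size_t own = static_cast<std::size_t>(labels[i]);
    double s = 0.0;
    if (cluster_size[own] > 1) {
      // a: mean distance to the other members of the sample's own cluster.
      const double a = dist_sum[own] / static_cast<double>(cluster_size[own] - 1);
      // b: smallest mean distance to any other non-empty cluster.
      double b = std::numeric_limits<double>::infinity();
      for (std::size_t c = 0; c < n_unique_labels; ++c) {
        if (c == own || cluster_size[c] == 0) continue;
        b = std::min(b, dist_sum[c] / static_cast<double>(cluster_size[c]));
      }
      if (std::isfinite(b) && std::max(a, b) > 0.0) s = (b - a) / std::max(a, b);
    }
    if (per_sample != nullptr) (*per_sample)[i] = s;
    total += s;
  }
  return total / static_cast<double>(n_rows);
}
```

Passing a non-null per_sample pointer plays the same role as the optional silhouette_score_per_sample output described above: it receives one score per row, and the return value is their mean.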
- float silhouette_score_batched(
    raft::resources const &handle,
    raft::device_matrix_view<const float, int64_t, raft::row_major> X,
    raft::device_vector_view<const int, int64_t> labels,
    std::optional<raft::device_vector_view<float, int64_t>> silhouette_score_per_sample,
    int64_t n_unique_labels,
    int64_t batch_size,
    cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded
  )
Computes the average silhouette score for a dataset and its cluster assignments, processing the samples in batches.
- Parameters:
handle – [in] raft handle for managing expensive resources
X – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector containing the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
batch_size – [in] number of samples per batch
metric – [in] distance metric to use; defaults to L2Unexpanded
- Returns:
The average silhouette score over all samples.
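The role of batch_size can be illustrated with a small CPU sketch. It assumes that batching only bounds how many pairwise distances are held in memory at once, so the per-cluster distance sums (and hence the score) do not depend on the batch size. The helper name cluster_distance_sums_tiled is hypothetical, Euclidean distance is used for illustration, and the actual cuVS batching strategy may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of cuVS.
// Returns a row-major (n_rows x n_unique_labels) matrix whose entry (i, c) is
// the sum of Euclidean distances from sample i to every member of cluster c.
// At most batch_size x batch_size distances are materialized at any time.
std::vector<double> cluster_distance_sums_tiled(const std::vector<double>& X,
                                                const std::vector<int>& labels,
                                                std::size_t n_cols,
                                                std::size_t n_unique_labels,
                                                std::size_t batch_size)
{
  const std::size_t n_rows = labels.size();
  assert(batch_size > 0 && n_cols > 0);
  assert(X.size() == n_rows * n_cols);

  std::vector<double> sums(n_rows * n_unique_labels, 0.0);
  std::vector<double> tile(batch_size * batch_size);  // reused scratch buffer

  for (std::size_t i0 = 0; i0 < n_rows; i0 += batch_size) {
    const std::size_t i1 = std::min(i0 + batch_size, n_rows);
    for (std::size_t j0 = 0; j0 < n_rows; j0 += batch_size) {
      const std::size_t j1 = std::min(j0 + batch_size, n_rows);

      // Step 1: fill the distance tile for rows [i0, i1) against rows [j0, j1).
      for (std::size_t i = i0; i < i1; ++i) {
        for (std::size_t j = j0; j < j1; ++j) {
          double sq = 0.0;
          for (std::size_t k = 0; k < n_cols; ++k) {
            const double d = X[i * n_cols + k] - X[j * n_cols + k];
            sq += d * d;
          }
          tile[(i - i0) * batch_size + (j - j0)] = std::sqrt(sq);
        }
      }

      // Step 2: reduce the tile into the per-cluster sums.
      for (std::size_t i = i0; i < i1; ++i) {
        for (std::size_t j = j0; j < j1; ++j) {
          assert(labels[j] >= 0);
          const std::size_t c = static_cast<std::size_t>(labels[j]);
          assert(c < n_unique_labels);
          sums[i * n_unique_labels + c] += tile[(i - i0) * batch_size + (j - j0)];
        }
      }
    }
  }
  return sums;
}
```

In this sketch a smaller batch_size lowers peak scratch memory at the cost of more tiles, while the returned sums stay the same; the a and b terms of the silhouette score follow from dividing these sums by the cluster sizes.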
- double silhouette_score(
    raft::resources const &handle,
    raft::device_matrix_view<const double, int64_t, raft::row_major> X_in,
    raft::device_vector_view<const int, int64_t> labels,
    std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample,
    int64_t n_unique_labels,
    cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded
  )
Computes the average silhouette score for a dataset and its cluster assignments (double-precision overload).
- Parameters:
handle – [in] raft handle for managing expensive resources
X_in – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector containing the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
metric – [in] distance metric to use; defaults to L2Unexpanded
- Returns:
The average silhouette score over all samples.
- double silhouette_score_batched(
    raft::resources const &handle,
    raft::device_matrix_view<const double, int64_t, raft::row_major> X,
    raft::device_vector_view<const int, int64_t> labels,
    std::optional<raft::device_vector_view<double, int64_t>> silhouette_score_per_sample,
    int64_t n_unique_labels,
    int64_t batch_size,
    cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded
  )
Computes the average silhouette score for a dataset and its cluster assignments, processing the samples in batches (double-precision overload).
- Parameters:
handle – [in] raft handle for managing expensive resources
X – [in] input data matrix in row-major format (nRows x nCols)
labels – [in] device vector containing the cluster label of every data sample (length: nRows)
silhouette_score_per_sample – [out] optional array populated with the silhouette score for every sample (length: nRows)
n_unique_labels – [in] number of unique labels in the labels array
batch_size – [in] number of samples per batch
metric – [in] distance metric to use; defaults to L2Unexpanded
- Returns:
The average silhouette score over all samples.
Trustworthiness Score
#include <cuvs/stats/trustworthiness_score.hpp>
namespace cuvs::stats
- double trustworthiness_score(
    raft::resources const &handle,
    raft::device_matrix_view<const float, int64_t, raft::row_major> X,
    raft::device_matrix_view<const float, int64_t, raft::row_major> X_embedded,
    int n_neighbors,
    cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2SqrtUnexpanded,
    int batch_size = 512
  )
Compute the trustworthiness score.
Note
The constness of the data in X_embedded is currently cast away, and the data is slightly modified.
- Parameters:
handle – [in] the raft handle
X – [in] data in the original dimension
X_embedded – [in] data in the target dimension (embedding)
n_neighbors – [in] number of neighbors considered by the trustworthiness score
metric – [in] distance metric to use; defaults to L2SqrtUnexpanded (Euclidean)
batch_size – [in] batch size; defaults to 512
- Returns:
The trustworthiness score.
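As an illustration of what this score measures, the following is a minimal CPU sketch of the commonly used trustworthiness definition: for each sample, every one of its n_neighbors nearest neighbors in the embedding that ranks r > n_neighbors in the original space contributes a penalty of r - n_neighbors, and the total is normalized so that a perfectly preserved neighborhood structure gives 1. The names trustworthiness_reference and neighbors_nearest_first are hypothetical and not part of cuVS. The sketch assumes Euclidean ordering, index-order tie-breaking, and 2 * n_neighbors < n_rows; whether cuVS follows exactly this normalization is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT part of cuVS: all samples other than i, nearest first.
// Squared Euclidean distance suffices because only the ordering matters.
// Ties are broken by sample index (stable sort).
std::vector<std::size_t> neighbors_nearest_first(const std::vector<double>& X,
                                                 std::size_t n_rows,
                                                 std::size_t n_cols,
                                                 std::size_t i)
{
  std::vector<double> dist(n_rows, 0.0);
  for (std::size_t j = 0; j < n_rows; ++j) {
    for (std::size_t k = 0; k < n_cols; ++k) {
      const double d = X[i * n_cols + k] - X[j * n_cols + k];
      dist[j] += d * d;
    }
  }
  std::vector<std::size_t> order;
  for (std::size_t j = 0; j < n_rows; ++j) {
    if (j != i) order.push_back(j);
  }
  std::stable_sort(order.begin(), order.end(),
                   [&dist](std::size_t a, std::size_t b) { return dist[a] < dist[b]; });
  return order;
}

// Hypothetical CPU reference, NOT part of cuVS.
// X is (n_rows x n_cols) and X_embedded is (n_rows x n_cols_embedded), both row-major.
double trustworthiness_reference(const std::vector<double>& X,
                                 std::size_t n_cols,
                                 const std::vector<double>& X_embedded,
                                 std::size_t n_cols_embedded,
                                 std::size_t n_rows,
                                 std::size_t n_neighbors)
{
  assert(X.size() == n_rows * n_cols);
  assert(X_embedded.size() == n_rows * n_cols_embedded);
  assert(n_neighbors > 0 && 2 * n_neighbors < n_rows);

  double penalty = 0.0;
  std::vector<std::size_t> rank(n_rows, 0);
  for (std::size_t i = 0; i < n_rows; ++i) {
    const auto original = neighbors_nearest_first(X, n_rows, n_cols, i);
    const auto embedded = neighbors_nearest_first(X_embedded, n_rows, n_cols_embedded, i);

    // rank[j] is the 1-based neighbor rank of sample j in the original space.
    for (std::size_t r = 0; r < original.size(); ++r) rank[original[r]] = r + 1;

    // Penalize embedded neighbors that were not close in the original space.
    for (std::size_t m = 0; m < n_neighbors; ++m) {
      const std::size_t r = rank[embedded[m]];
      if (r > n_neighbors) penalty += static_cast<double>(r - n_neighbors);
    }
  }

  const double n = static_cast<double>(n_rows);
  const double k = static_cast<double>(n_neighbors);
  return 1.0 - 2.0 / (n * k * (2.0 * n - 3.0 * k - 1.0)) * penalty;
}
```

Passing the same matrix as both X and X_embedded yields 1, since every embedded neighbor is also an original neighbor and no penalty accrues.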