Cluster
Params
#include <cuvs/cluster/kmeans.hpp>
namespace cuvs::cluster::kmeans
-
struct params : public cuvs::cluster::kmeans::base_params
- #include <kmeans.hpp>
Simple object to specify hyper-parameters to the kmeans algorithm.
Public Members
-
int n_clusters = 8
The number of clusters to form as well as the number of centroids to generate (default: 8).
-
InitMethod init = KMeansPlusPlus
Method for initialization, defaults to k-means++:
InitMethod::KMeansPlusPlus (k-means++): Use scalable k-means++ algorithm to select the initial cluster centers.
InitMethod::Random (random): Choose ‘n_clusters’ observations (rows) at random from the input data for the initial centroids.
InitMethod::Array (ndarray): Use ‘centroids’ as initial cluster centers.
-
int max_iter = 300
Maximum number of iterations of the k-means algorithm for a single run.
-
double tol = 1e-4
Relative tolerance with regard to inertia to declare convergence.
-
int verbosity = RAFT_LEVEL_INFO
Verbosity level.
-
raft::random::RngState rng_state = {0}
Seed for the random number generator.
-
int n_init = 1
Number of times the k-means algorithm will be run with different seeds.
-
double oversampling_factor = 2.0
Oversampling factor for use in the k-means|| algorithm.
-
int batch_centroids = 0
If 0, batch_centroids is set to n_clusters.
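The tol and max_iter members together bound the iteration loop. The following is a minimal host-side sketch of a convergence test, assuming that "relative tolerance with regard to inertia" means iteration stops once the relative change in inertia between two consecutive iterations drops below tol. The exact criterion used inside cuVS is not stated on this page, and the helper name has_converged is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of cuVS: returns true when the relative
// change in inertia between two consecutive iterations is below tol.
inline bool has_converged(double prev_inertia, double curr_inertia, double tol)
{
  assert(tol >= 0.0);
  // A zero previous inertia means every sample already sits on its centroid.
  if (prev_inertia == 0.0) { return true; }
  const double relative_change = std::abs(prev_inertia - curr_inertia) / prev_inertia;
  return relative_change < tol;
}
```

A training loop would call such a check after every iteration and also stop once max_iter iterations have run.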
-
struct balanced_params : public cuvs::cluster::kmeans::base_params
- #include <kmeans.hpp>
Simple object to specify hyper-parameters to the balanced k-means algorithm.
The following metrics are currently supported in k-means balanced:
CosineExpanded
InnerProduct
L2Expanded
L2SqrtExpanded
Public Members
-
uint32_t n_iters = 20
Number of training iterations.
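To make the four metric names listed above concrete, here is a host-side sketch of what they conventionally compute for a pair of vectors. It assumes the usual meaning of the "Expanded" suffix, namely that the squared L2 distance is evaluated as |x|^2 + |y|^2 - 2 x.y instead of summing squared differences, and that CosineExpanded denotes the cosine distance 1 - cos(x, y). The function names are illustrative only and are not cuVS APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two equally sized vectors (InnerProduct).
inline double dot(const std::vector<double>& x, const std::vector<double>& y)
{
  assert(x.size() == y.size());
  double acc = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) { acc += x[i] * y[i]; }
  return acc;
}

// Squared L2 distance in expanded form: |x|^2 + |y|^2 - 2 x.y (L2Expanded).
inline double l2_expanded(const std::vector<double>& x, const std::vector<double>& y)
{
  const double d = dot(x, x) + dot(y, y) - 2.0 * dot(x, y);
  return std::max(d, 0.0);  // clamp tiny negative values caused by rounding
}

// Square root of the expanded squared distance (L2SqrtExpanded).
inline double l2_sqrt_expanded(const std::vector<double>& x, const std::vector<double>& y)
{
  return std::sqrt(l2_expanded(x, y));
}

// Cosine distance 1 - x.y / (|x| |y|); both vectors are assumed non-zero.
inline double cosine_expanded(const std::vector<double>& x, const std::vector<double>& y)
{
  const double denom = std::sqrt(dot(x, x) * dot(y, y));
  assert(denom > 0.0);
  return 1.0 - dot(x, y) / denom;
}
```

The expanded form lets the norms be precomputed once per vector, so the pairwise part reduces to a matrix product.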
K-means
#include <cuvs/cluster/kmeans.hpp>
namespace cuvs::cluster::kmeans
- void fit(
- raft::resources const &handle,
- const cuvs::cluster::kmeans::params &params,
- raft::device_matrix_view<const float, int> X,
- std::optional<raft::device_vector_view<const float, int>> sample_weight,
- raft::device_matrix_view<float, int> centroids,
- raft::host_scalar_view<float, int> inertia,
- raft::host_scalar_view<int, int> n_iter)
Find clusters with k-means algorithm. Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinitialized by choosing new centroids with k-means++ algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
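The inertia output can be sanity-checked on small inputs with a plain host-side reference. The sketch below follows the definition given above (sum of squared distances of samples to their closest cluster center) and assumes that an optional sample weight simply scales each sample's contribution; that weighting rule and the function name reference_inertia are assumptions, not part of cuVS.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical host-side reference, not the cuVS implementation.
// X is row-major [n_samples x n_features], centroids is row-major
// [n_clusters x n_features], weights is either empty or [n_samples].
inline double reference_inertia(const std::vector<double>& X,
                                const std::vector<double>& centroids,
                                std::size_t n_features,
                                const std::vector<double>& weights = {})
{
  assert(n_features > 0);
  assert(X.size() % n_features == 0 && centroids.size() % n_features == 0);
  const std::size_t n_samples  = X.size() / n_features;
  const std::size_t n_clusters = centroids.size() / n_features;
  assert(n_clusters > 0);
  assert(weights.empty() || weights.size() == n_samples);

  double inertia = 0.0;
  for (std::size_t i = 0; i < n_samples; ++i) {
    // Squared distance from sample i to its closest centroid.
    double best = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < n_clusters; ++c) {
      double d2 = 0.0;
      for (std::size_t f = 0; f < n_features; ++f) {
        const double diff = X[i * n_features + f] - centroids[c * n_features + f];
        d2 += diff * diff;
      }
      if (d2 < best) { best = d2; }
    }
    inertia += (weights.empty() ? 1.0 : weights[i]) * best;
  }
  return inertia;
}
```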
- void fit(
- raft::resources const &handle,
- const cuvs::cluster::kmeans::params &params,
- raft::device_matrix_view<const float, int64_t> X,
- std::optional<raft::device_vector_view<const float, int64_t>> sample_weight,
- raft::device_matrix_view<float, int64_t> centroids,
- raft::host_scalar_view<float, int64_t> inertia,
- raft::host_scalar_view<int64_t, int64_t> n_iter)
Find clusters with k-means algorithm. Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinitialized by choosing new centroids with k-means++ algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int64_t n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int64_t>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
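For readers unfamiliar with k-means++ initialization, the following host-side sketch shows the classic sequential seeding rule: the first center is drawn uniformly, and every further center is drawn with probability proportional to the squared distance to the nearest center chosen so far. This is the textbook variant, not the scalable k-means|| variant that InitMethod::KMeansPlusPlus refers to, and it is not cuVS code. The function name kmeanspp_seed and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical sketch of classic k-means++ seeding (not cuVS code).
// X is row-major [n_samples x n_features]. Returns the row indices of the
// chosen initial centers. Assumes X has at least n_clusters distinct rows.
inline std::vector<std::size_t> kmeanspp_seed(const std::vector<double>& X,
                                              std::size_t n_features,
                                              std::size_t n_clusters,
                                              unsigned seed)
{
  assert(n_features > 0 && X.size() % n_features == 0);
  const std::size_t n_samples = X.size() / n_features;
  assert(n_clusters > 0 && n_clusters <= n_samples);

  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> uniform(0, n_samples - 1);
  std::vector<std::size_t> chosen;
  chosen.push_back(uniform(rng));  // first center: uniform draw

  // min_d2[i] is the squared distance from sample i to its nearest center.
  std::vector<double> min_d2(n_samples, std::numeric_limits<double>::max());
  while (chosen.size() < n_clusters) {
    const std::size_t last = chosen.back();
    for (std::size_t i = 0; i < n_samples; ++i) {
      double d2 = 0.0;
      for (std::size_t f = 0; f < n_features; ++f) {
        const double diff = X[i * n_features + f] - X[last * n_features + f];
        d2 += diff * diff;
      }
      if (d2 < min_d2[i]) { min_d2[i] = d2; }
    }
    // Already chosen rows have weight 0, so they are not drawn again.
    std::discrete_distribution<std::size_t> pick(min_d2.begin(), min_d2.end());
    chosen.push_back(pick(rng));
  }
  return chosen;
}
```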
- void fit(
- raft::resources const &handle,
- const cuvs::cluster::kmeans::params &params,
- raft::device_matrix_view<const double, int> X,
- std::optional<raft::device_vector_view<const double, int>> sample_weight,
- raft::device_matrix_view<double, int> centroids,
- raft::host_scalar_view<double, int> inertia,
- raft::host_scalar_view<int, int> n_iter)
Find clusters with k-means algorithm. Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinitialized by choosing new centroids with k-means++ algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit(
- raft::resources const &handle,
- const cuvs::cluster::kmeans::params &params,
- raft::device_matrix_view<const double, int64_t> X,
- std::optional<raft::device_vector_view<const double, int64_t>> sample_weight,
- raft::device_matrix_view<double, int64_t> centroids,
- raft::host_scalar_view<double, int64_t> inertia,
- raft::host_scalar_view<int64_t, int64_t> n_iter)
Find clusters with k-means algorithm. Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinitialized by choosing new centroids with k-means++ algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int64_t n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int64_t>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit(
- raft::resources const &handle,
- const cuvs::cluster::kmeans::params &params,
- raft::device_matrix_view<const int8_t, int> X,
- std::optional<raft::device_vector_view<const int8_t, int>> sample_weight,
- raft::device_matrix_view<int8_t, int> centroids,
- raft::host_scalar_view<int8_t, int> inertia,
- raft::host_scalar_view<int, int> n_iter)
Find clusters with k-means algorithm. Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinitialized by choosing new centroids with k-means++ algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
int8_t inertia;
auto centroids = raft::make_device_matrix<int8_t, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit(
- const raft::resources &handle,
- cuvs::cluster::kmeans::balanced_params const &params,
- raft::device_matrix_view<const float, int> X,
- raft::device_matrix_view<float, int> centroids)
Find balanced clusters with k-means algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::balanced_params params;
int n_features = 15;
// balanced_params, as documented above, lists no n_clusters member,
// so the cluster count is set here (illustrative value).
int n_clusters = 8;
auto centroids = raft::make_device_matrix<float, int>(handle, n_clusters, n_features);
kmeans::fit(handle, params, X, centroids.view());
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
centroids – [out] The centroids computed by the balanced k-means algorithm. [dim = n_clusters x n_features]
- void fit(
- const raft::resources &handle,
- cuvs::cluster::kmeans::balanced_params const &params,
- raft::device_matrix_view<const int8_t, int> X,
- raft::device_matrix_view<int8_t, int> centroids)
Find balanced clusters with k-means algorithm.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::balanced_params params;
int n_features = 15;
// balanced_params, as documented above, lists no n_clusters member,
// so the cluster count is set here (illustrative value).
int n_clusters = 8;
auto centroids = raft::make_device_matrix<int8_t, int>(handle, n_clusters, n_features);
kmeans::fit(handle, params, X, centroids.view());
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
centroids – [out] The centroids computed by the balanced k-means algorithm. [dim = n_clusters x n_features]
- void predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const float, int> X,
- std::optional<raft::device_vector_view<const float, int>> sample_weight,
- raft::device_matrix_view<const float, int> centroids,
- raft::device_vector_view<int, int> labels,
- bool normalize_weight,
- raft::host_scalar_view<float> inertia)
Predict the closest cluster each sample in X belongs to.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
...
auto labels = raft::make_device_vector<int, int>(handle, X.extent(0));
// Argument order follows the signature: labels, then normalize_weight.
kmeans::predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                false, raft::make_scalar_view(&inertia));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] New data to predict. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
normalize_weight – [in] True if the weights should be normalized
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
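What predict computes can be illustrated with a short host-side reference: each sample receives the index of the centroid with the smallest squared Euclidean distance. The sketch assumes L2 distance and breaks ties in favor of the lowest cluster index; both choices, and the name reference_predict, are assumptions made for illustration and not statements about the cuVS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical host-side reference, not the cuVS implementation.
// X is row-major [n_samples x n_features], centroids is row-major
// [n_clusters x n_features]. Returns one label per sample.
inline std::vector<int> reference_predict(const std::vector<double>& X,
                                          const std::vector<double>& centroids,
                                          std::size_t n_features)
{
  assert(n_features > 0);
  assert(X.size() % n_features == 0 && centroids.size() % n_features == 0);
  const std::size_t n_samples  = X.size() / n_features;
  const std::size_t n_clusters = centroids.size() / n_features;
  assert(n_clusters > 0);

  std::vector<int> labels(n_samples, 0);
  for (std::size_t i = 0; i < n_samples; ++i) {
    double best = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < n_clusters; ++c) {
      double d2 = 0.0;
      for (std::size_t f = 0; f < n_features; ++f) {
        const double diff = X[i * n_features + f] - centroids[c * n_features + f];
        d2 += diff * diff;
      }
      // Strict comparison keeps the lowest index on ties.
      if (d2 < best) {
        best      = d2;
        labels[i] = static_cast<int>(c);
      }
    }
  }
  return labels;
}
```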
- void predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const float, int> X,
- std::optional<raft::device_vector_view<const float, int>> sample_weight,
- raft::device_matrix_view<const float, int> centroids,
- raft::device_vector_view<int64_t, int> labels,
- bool normalize_weight,
- raft::host_scalar_view<float> inertia)
Predict the closest cluster each sample in X belongs to.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
...
auto labels = raft::make_device_vector<int64_t, int>(handle, X.extent(0));
// Argument order follows the signature: labels, then normalize_weight.
kmeans::predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                false, raft::make_scalar_view(&inertia));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] New data to predict. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
normalize_weight – [in] True if the weights should be normalized
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
- void predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const double, int> X,
- std::optional<raft::device_vector_view<const double, int>> sample_weight,
- raft::device_matrix_view<const double, int> centroids,
- raft::device_vector_view<int, int> labels,
- bool normalize_weight,
- raft::host_scalar_view<double> inertia)
Predict the closest cluster each sample in X belongs to.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
...
auto labels = raft::make_device_vector<int, int>(handle, X.extent(0));
// Argument order follows the signature: labels, then normalize_weight.
kmeans::predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                false, raft::make_scalar_view(&inertia));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] New data to predict. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
normalize_weight – [in] True if the weights should be normalized
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
- void predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const double, int> X,
- std::optional<raft::device_vector_view<const double, int>> sample_weight,
- raft::device_matrix_view<const double, int> centroids,
- raft::device_vector_view<int64_t, int> labels,
- bool normalize_weight,
- raft::host_scalar_view<double> inertia)
Predict the closest cluster each sample in X belongs to.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int>(handle, params.n_clusters, n_features);
kmeans::fit(handle, params, X, std::nullopt, centroids.view(),
            raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
...
auto labels = raft::make_device_vector<int64_t, int>(handle, X.extent(0));
// Argument order follows the signature: labels, then normalize_weight.
kmeans::predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                false, raft::make_scalar_view(&inertia));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] New data to predict. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
normalize_weight – [in] True if the weights should be normalized
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
- void predict(
- const raft::resources &handle,
- cuvs::cluster::kmeans::balanced_params const &params,
- raft::device_matrix_view<const int8_t, int> X,
- raft::device_matrix_view<const float, int> centroids,
- raft::device_vector_view<uint32_t, int> labels)
Predict the closest cluster each sample in X belongs to.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::balanced_params params;
int n_features = 15;
// balanced_params, as documented above, lists no n_clusters member,
// so the cluster count is set here (illustrative value).
int n_clusters = 8;
auto centroids = raft::make_device_matrix<float, int>(handle, n_clusters, n_features);
... // centroids are assumed to have been computed beforehand by a balanced fit
auto labels = raft::make_device_vector<uint32_t, int>(handle, X.extent(0));
kmeans::predict(handle, params, X, centroids.view(), labels.view());
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] New data to predict. [dim = n_samples x n_features]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
- void fit_predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const float, int> X,
- std::optional<raft::device_vector_view<const float, int>> sample_weight,
- std::optional<raft::device_matrix_view<float, int>> centroids,
- raft::device_vector_view<int, int> labels,
- raft::host_scalar_view<float> inertia,
- raft::host_scalar_view<int> n_iter)
Computes k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int>(handle, params.n_clusters, n_features);
auto labels = raft::make_device_vector<int, int>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                    raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] Optional. On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
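To show how the outputs of fit_predict relate to each other, here is a compact host-side sketch of Lloyd's algorithm that starts from caller-supplied centroids (comparable to InitMethod::Array) and returns labels, inertia, and the iteration count. It stops when no label changes instead of using tol, it leaves empty clusters untouched instead of reinitializing them, and all names are hypothetical. It is a teaching aid under those assumptions, not the cuVS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct lloyd_result {
  std::vector<int> labels;  // [n_samples]
  double inertia;           // measured against the centroids used in the last assignment step
  int n_iter;               // number of assignment steps performed
};

// X is row-major [n_samples x n_features]; centroids is row-major
// [n_clusters x n_features] and is updated in place.
inline lloyd_result reference_fit_predict(const std::vector<double>& X,
                                          std::vector<double>& centroids,
                                          std::size_t n_features,
                                          int max_iter)
{
  assert(n_features > 0 && max_iter > 0);
  assert(X.size() % n_features == 0 && centroids.size() % n_features == 0);
  const std::size_t n_samples  = X.size() / n_features;
  const std::size_t n_clusters = centroids.size() / n_features;
  assert(n_clusters > 0);

  lloyd_result out{std::vector<int>(n_samples, -1), 0.0, 0};
  for (int iter = 0; iter < max_iter; ++iter) {
    out.n_iter   = iter + 1;
    out.inertia  = 0.0;
    bool changed = false;
    // Assignment step: attach every sample to its closest centroid.
    for (std::size_t i = 0; i < n_samples; ++i) {
      double best = std::numeric_limits<double>::max();
      int best_c  = 0;
      for (std::size_t c = 0; c < n_clusters; ++c) {
        double d2 = 0.0;
        for (std::size_t f = 0; f < n_features; ++f) {
          const double diff = X[i * n_features + f] - centroids[c * n_features + f];
          d2 += diff * diff;
        }
        if (d2 < best) { best = d2; best_c = static_cast<int>(c); }
      }
      if (out.labels[i] != best_c) { out.labels[i] = best_c; changed = true; }
      out.inertia += best;
    }
    if (!changed) { break; }
    // Update step: move each non-empty cluster's centroid to the mean of its members.
    std::vector<double> sums(centroids.size(), 0.0);
    std::vector<std::size_t> counts(n_clusters, 0);
    for (std::size_t i = 0; i < n_samples; ++i) {
      const std::size_t c = static_cast<std::size_t>(out.labels[i]);
      ++counts[c];
      for (std::size_t f = 0; f < n_features; ++f) { sums[c * n_features + f] += X[i * n_features + f]; }
    }
    for (std::size_t c = 0; c < n_clusters; ++c) {
      if (counts[c] == 0) { continue; }  // empty cluster: keep the old centroid
      for (std::size_t f = 0; f < n_features; ++f) {
        centroids[c * n_features + f] = sums[c * n_features + f] / static_cast<double>(counts[c]);
      }
    }
  }
  return out;
}
```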
- void fit_predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const float, int64_t> X,
- std::optional<raft::device_vector_view<const float, int64_t>> sample_weight,
- std::optional<raft::device_matrix_view<float, int64_t>> centroids,
- raft::device_vector_view<int64_t, int64_t> labels,
- raft::host_scalar_view<float> inertia,
- raft::host_scalar_view<int64_t> n_iter)
Computes k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int64_t n_features = 15, n_iter;
float inertia;
auto centroids = raft::make_device_matrix<float, int64_t>(handle, params.n_clusters, n_features);
auto labels = raft::make_device_vector<int64_t, int64_t>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                    raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] Optional. On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit_predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const double, int> X,
- std::optional<raft::device_vector_view<const double, int>> sample_weight,
- std::optional<raft::device_matrix_view<double, int>> centroids,
- raft::device_vector_view<int, int> labels,
- raft::host_scalar_view<double> inertia,
- raft::host_scalar_view<int> n_iter)
Computes k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int>(handle, params.n_clusters, n_features);
auto labels = raft::make_device_vector<int, int>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                    raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] Optional. On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit_predict(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const double, int64_t> X,
- std::optional<raft::device_vector_view<const double, int64_t>> sample_weight,
- std::optional<raft::device_matrix_view<double, int64_t>> centroids,
- raft::device_vector_view<int64_t, int64_t> labels,
- raft::host_scalar_view<double> inertia,
- raft::host_scalar_view<int64_t> n_iter)
Computes k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::params params;
int64_t n_features = 15, n_iter;
double inertia;
auto centroids = raft::make_device_matrix<double, int64_t>(handle, params.n_clusters, n_features);
auto labels = raft::make_device_vector<int64_t, int64_t>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, std::nullopt, centroids.view(), labels.view(),
                    raft::make_scalar_view(&inertia), raft::make_scalar_view(&n_iter));
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
sample_weight – [in] Optional weights for each observation in X. [len = n_samples]
centroids – [inout] Optional. On input, used as the initial cluster centers when init is InitMethod::Array. On output, holds the centroids computed by the k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
inertia – [out] Sum of squared distances of samples to their closest cluster center.
n_iter – [out] Number of iterations run.
- void fit_predict(
- const raft::resources &handle,
- cuvs::cluster::kmeans::balanced_params const &params,
- raft::device_matrix_view<const float, int> X,
- raft::device_matrix_view<float, int> centroids,
- raft::device_vector_view<uint32_t, int> labels)
Computes balanced k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::balanced_params params;
int n_features = 15;
// balanced_params, as documented above, lists no n_clusters member,
// so the cluster count is set here (illustrative value).
int n_clusters = 8;
auto centroids = raft::make_device_matrix<float, int>(handle, n_clusters, n_features);
auto labels = raft::make_device_vector<uint32_t, int>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, centroids.view(), labels.view());
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
centroids – [inout] On output, holds the centroids computed by the balanced k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
- void fit_predict(
- const raft::resources &handle,
- cuvs::cluster::kmeans::balanced_params const &params,
- raft::device_matrix_view<const int8_t, int> X,
- raft::device_matrix_view<float, int> centroids,
- raft::device_vector_view<uint32_t, int> labels)
Computes balanced k-means clustering and predicts the cluster index for each sample in the input.
#include <raft/core/resources.hpp>
#include <cuvs/cluster/kmeans.hpp>
using namespace cuvs::cluster;
...
raft::resources handle;
cuvs::cluster::kmeans::balanced_params params;
int n_features = 15;
// balanced_params, as documented above, lists no n_clusters member,
// so the cluster count is set here (illustrative value).
int n_clusters = 8;
auto centroids = raft::make_device_matrix<float, int>(handle, n_clusters, n_features);
auto labels = raft::make_device_vector<uint32_t, int>(handle, X.extent(0));
kmeans::fit_predict(handle, params, X, centroids.view(), labels.view());
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format. [dim = n_samples x n_features]
centroids – [inout] On output, holds the centroids computed by the balanced k-means algorithm. [dim = n_clusters x n_features]
labels – [out] Index of the cluster each sample in X belongs to. [len = n_samples]
- void transform(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const float, int> X,
- raft::device_matrix_view<const float, int> centroids,
- raft::device_matrix_view<float, int> X_new)
Transform X to a cluster-distance space.
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format [dim = n_samples x n_features]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
X_new – [out] X transformed in the new space, with one distance per cluster for each sample. [dim = n_samples x n_clusters]
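A cluster-distance space has one coordinate per centroid. The host-side sketch below illustrates the idea under two assumptions that this page does not confirm: the distance is plain Euclidean (L2) distance, and the output is laid out row-major with n_clusters values per sample. The name reference_transform is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not the cuVS implementation.
// X is row-major [n_samples x n_features], centroids is row-major
// [n_clusters x n_features]. Returns row-major [n_samples x n_clusters]
// Euclidean distances from every sample to every centroid.
inline std::vector<double> reference_transform(const std::vector<double>& X,
                                               const std::vector<double>& centroids,
                                               std::size_t n_features)
{
  assert(n_features > 0);
  assert(X.size() % n_features == 0 && centroids.size() % n_features == 0);
  const std::size_t n_samples  = X.size() / n_features;
  const std::size_t n_clusters = centroids.size() / n_features;

  std::vector<double> X_new(n_samples * n_clusters, 0.0);
  for (std::size_t i = 0; i < n_samples; ++i) {
    for (std::size_t c = 0; c < n_clusters; ++c) {
      double d2 = 0.0;
      for (std::size_t f = 0; f < n_features; ++f) {
        const double diff = X[i * n_features + f] - centroids[c * n_features + f];
        d2 += diff * diff;
      }
      X_new[i * n_clusters + c] = std::sqrt(d2);
    }
  }
  return X_new;
}
```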
- void transform(
- raft::resources const &handle,
- const kmeans::params &params,
- raft::device_matrix_view<const double, int> X,
- raft::device_matrix_view<const double, int> centroids,
- raft::device_matrix_view<double, int> X_new)
Transform X to a cluster-distance space.
- Parameters:
handle – [in] The raft handle.
params – [in] Parameters for KMeans model.
X – [in] Training instances to cluster. The data must be in row-major format [dim = n_samples x n_features]
centroids – [in] Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features]
X_new – [out] X transformed in the new space, with one distance per cluster for each sample. [dim = n_samples x n_clusters]
K-means Helpers
#include <cuvs/cluster/kmeans.hpp>
namespace cuvs::cluster::kmeans::helpers