Attention

The vector search and clustering algorithms in RAFT are being migrated to a new library dedicated to vector search called cuVS. We will continue to support the vector search algorithms in RAFT during this move, but will no longer update them after the RAPIDS 24.06 (June) release. We plan to complete the migration by RAPIDS 24.08 (August) release.

Random Ball Cover#

#include <raft/neighbors/ball_cover.cuh>

namespace raft::neighbors::ball_cover

template<typename idx_t, typename value_t, typename int_t, typename matrix_idx_t>
void all_knn_query(raft::resources const &handle, BallCoverIndex<idx_t, value_t, int_t, matrix_idx_t> &index, raft::device_matrix_view<idx_t, matrix_idx_t, row_major> inds, raft::device_matrix_view<value_t, matrix_idx_t, row_major> dists, int_t k, bool perform_post_filtering = true, float weight = 1.0)#

Performs a faster exact knn in metric spaces using the triangle inequality with a number of landmark points to reduce the number of distance computations from O(n^2) to O(sqrt(n)). This performs an all neighbors knn, which can reuse memory when the index and query are the same array. This function will build the index and assumes rbc_build_index() has not already been called.

Usage example:

#include <raft/core/resources.hpp>
#include <raft/neighbors/ball_cover.cuh>
#include <raft/distance/distance_types.hpp>
using namespace raft::neighbors;

raft::resources handle;
...
auto metric = raft::distance::DistanceType::L2Expanded;

// Construct a ball cover index
BallCoverIndex index(handle, X, metric);

// Perform all neighbors knn query
ball_cover::all_knn_query(handle, index, inds, dists, k);

Template Parameters:
  • idx_t – knn index type

  • value_t – knn distance type

  • int_t – type for integers, such as number of rows/cols

  • matrix_idx_t – matrix indexing type

Parameters:
  • handle[in] raft handle for resource management

  • index[in] ball cover index which has not yet been built

  • inds[out] output knn indices

  • dists[out] output knn distances

  • k[in] number of nearest neighbors to find

  • perform_post_filtering[in] if this is false, only the closest k landmarks are considered (which will return approximate results).

  • weight[in] a weight for overlap between the closest landmark and the radius of other landmarks when pruning distances. Setting this value below 1 can effectively turn off computing distances against many other balls, enabling approximate nearest neighbors. Recall can be adjusted based on how many relevant balls are ignored. Note that many datasets can still have great recall even by only looking in the closest landmark.

template<typename idx_t, typename value_t, typename int_t, typename matrix_idx_t>
void knn_query(raft::resources const &handle, const BallCoverIndex<idx_t, value_t, int_t, matrix_idx_t> &index, raft::device_matrix_view<const value_t, matrix_idx_t, row_major> query, raft::device_matrix_view<idx_t, matrix_idx_t, row_major> inds, raft::device_matrix_view<value_t, matrix_idx_t, row_major> dists, int_t k, bool perform_post_filtering = true, float weight = 1.0)#

Performs a faster exact knn in metric spaces using the triangle inequality with a number of landmark points to reduce the number of distance computations from O(n^2) to O(sqrt(n)). This function does not build the index and assumes rbc_build_index() has already been called. Use this function when the index and query arrays are different, otherwise use rbc_all_knn_query().

Usage example:

#include <raft/core/resources.hpp>
#include <raft/neighbors/ball_cover.cuh>
#include <raft/distance/distance_types.hpp>
using namespace raft::neighbors;

raft::resources handle;
...
auto metric = raft::distance::DistanceType::L2Expanded;

// Build a ball cover index
BallCoverIndex index(handle, X, metric);
ball_cover::build_index(handle, index);

// Perform all neighbors knn query
ball_cover::knn_query(handle, index, inds, dists, k);

Template Parameters:
  • idx_t – index type

  • value_t – distances type

  • int_t – integer type for size info

  • matrix_idx_t

Parameters:
  • handle[in] raft handle for resource management

  • index[in] ball cover index which has not yet been built

  • query[in] device matrix containing query data points

  • inds[out] output knn indices

  • dists[out] output knn distances

  • k[in] number of nearest neighbors to find

  • perform_post_filtering[in] if this is false, only the closest k landmarks are considered (which will return approximate results).

  • weight[in] a weight for overlap between the closest landmark and the radius of other landmarks when pruning distances. Setting this value below 1 can effectively turn off computing distances against many other balls, enabling approximate nearest neighbors. Recall can be adjusted based on how many relevant balls are ignored. Note that many datasets can still have great recall even by only looking in the closest landmark.

template<typename idx_t, typename value_t, typename int_t, typename matrix_idx_t>
void build_index(raft::resources const &handle, BallCoverIndex<idx_t, value_t, int_t, matrix_idx_t> &index)#

Builds and populates a previously unbuilt BallCoverIndex

Usage example:

#include <raft/core/resources.hpp>
#include <raft/neighbors/ball_cover.cuh>
#include <raft/distance/distance_types.hpp>
using namespace raft::neighbors;

raft::resources handle;
...
auto metric = raft::distance::DistanceType::L2Expanded;
BallCoverIndex index(handle, X, metric);

ball_cover::build_index(handle, index);

Template Parameters:
  • idx_t – knn index type

  • value_t – knn value type

  • int_t – integral type for knn params

  • matrix_idx_t – matrix indexing type

Parameters:
  • handle[in] library resource management handle

  • index[inout] an empty (and not previous built) instance of BallCoverIndex

template<typename value_idx, typename value_t, typename value_int = std::int64_t, typename matrix_idx = std::int64_t>
class BallCoverIndex#
#include <ball_cover_types.hpp>

Stores raw index data points, sampled landmarks, the 1-nns of index points to their closest landmarks, and the ball radii of each landmark. This class is intended to be constructed once and reused across subsequent queries.

Template Parameters:
  • value_idx

  • value_t

  • value_int

Public Functions

inline explicit BallCoverIndex(raft::resources const &handle_, const value_t *X_, value_int m_, value_int n_, raft::distance::DistanceType metric_)#

the sqrt() here makes the sqrt(m)^2 a linear-time lower bound

Total memory footprint of index: (2 * sqrt(m)) + (n * sqrt(m)) + (2 * m)

inline explicit BallCoverIndex(raft::resources const &handle_, raft::device_matrix_view<const value_t, matrix_idx, row_major> X_, raft::distance::DistanceType metric_)#

the sqrt() here makes the sqrt(m)^2 a linear-time lower bound

Total memory footprint of index: (2 * sqrt(m)) + (n * sqrt(m)) + (2 * m)