NN-Descent#

The NN-descent method is an ANN algorithm that directly approximates a k-nearest neighbors graph by randomly sampling points to compute distances and using neighbors of neighbors distances to reduce distance computations.

#include <cuvs/neighbors/nn_descent.hpp>

namespace cuvs::neighbors::nn_descent

Index build parameters#

struct index_params : public cuvs::neighbors::index_params#
#include <nn_descent.hpp>

Parameters used to build an nn-descent index.

graph_degree: For an input dataset of dimensions (N, D), determines the final dimensions of the all-neighbors knn graph which turns out to be of dimensions (N, graph_degree) intermediate_graph_degree: Internally, nn-descent builds an all-neighbors knn graph of dimensions (N, intermediate_graph_degree) before selecting the final graph_degree neighbors. It’s recommended that intermediate_graph_degree >= 1.5 * graph_degree max_iterations: The number of iterations that nn-descent will refine the graph for. More iterations produce a better quality graph at cost of performance termination_threshold: The delta at which nn-descent will terminate its iterations

Index#

template<typename IdxT>
struct index : public cuvs::neighbors::index#
#include <nn_descent.hpp>

nn-descent Build an nn-descent index The index contains an all-neighbors graph of the input dataset stored in host memory of dimensions (n_rows, n_cols)

Reference: Hui Wang, Wan-Lei Zhao, Xiangxiang Zeng, and Jianye Yang. 2021. Fast k-NN Graph Construction by GPU based NN-Descent. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM ‘21). Association for Computing Machinery, New York, NY, USA, 1929–1938. https://doi.org/10.1145/3459637.3482344

Template Parameters:

IdxT – dtype to be used for constructing knn-graph

Index build#

size_t graph_degree = 64#
size_t intermediate_graph_degree = 128#
size_t max_iterations = 20#
float termination_threshold = 0.0001#
bool return_distances = true#
size_t n_clusters = 1#
index_params(size_t graph_degree = 64, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded)#

Construct NN descent parameters for a specific kNN graph degree.

Parameters:
  • graph_degree – output graph degree

  • metric – distance metric to use

inline index(raft::resources const &res, int64_t n_rows, int64_t n_cols, bool return_distances = false, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded)#

Construct a new index object.

This constructor creates an nn-descent index which is a knn-graph in host memory. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).

Parameters:
  • res – raft::resources is an object mangaging resources

  • n_rows – number of rows in knn-graph

  • n_cols – number of cols in knn-graph

  • return_distances – whether to return distances

  • metric – distance metric to use

inline index(raft::resources const &res, raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph_view, std::optional<raft::device_matrix_view<float, int64_t, row_major>> distances_view = std::nullopt, cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded)#

Construct a new index object.

This constructor creates an nn-descent index using a user allocated host memory knn-graph. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).

Parameters:
  • res – raft::resources is an object mangaging resources

  • graph_view – raft::host_matrix_view<IdxT, int64_t, raft::row_major> for storing knn-graph

  • distances_view – optional raft::device_matrix_view<float, int64_t, row_major> for storing distances

  • metric – distance metric to use

inline cuvs::distance::DistanceType metric() const noexcept#

Distance metric used for clustering.

inline IdxT size() const noexcept#

Total length of the index (number of vectors).

inline uint32_t graph_degree() const noexcept

Graph degree

inline raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph() noexcept#

neighborhood graph [size, graph-degree]

inline std::optional<device_matrix_view<float, int64_t, row_major>> distances() noexcept#

neighborhood graph distances [size, graph-degree]

index(const index&) = delete#
index(index&&) = default#
index &=delete operator= (const index &)
index &=default operator= (index &&)
~index() = default#
cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::device_matrix_view<const float, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::host_matrix_view<const float, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors::experimental;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = cagra::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::device_matrix_view<const half, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::host_matrix_view<const half, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors::experimental;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = cagra::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors::experimental;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = cagra::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(raft::resources const &res, index_params const &params, raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset, std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2

Usage example:

using namespace cuvs::neighbors::experimental;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = cagra::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

bool has_enough_device_memory(raft::resources const &res, raft::matrix_extent<int64_t> dataset, size_t idx_size = 4)#

Test if we have enough GPU memory to run NN descent algorithm.

Parameters:
  • res

  • dataset – shape of the dataset

  • idx_size – the size of index type in bytes

Returns:

true if enough GPU memory can be allocated

Returns:

false otherwise