NN-Descent#

The NN-descent method is an ANN algorithm that directly approximates a k-nearest neighbors graph by randomly sampling points to compute distances and using neighbors of neighbors distances to reduce distance computations.

#include <cuvs/neighbors/nn_descent.hpp>

namespace cuvs::neighbors::nn_descent

Index build parameters#

enum class DIST_COMP_DTYPE#

Dtype to use for distance computation.

  • AUTO: Automatically determine the best dtype for distance computation based on the dataset dimensions.

  • FP32: Use fp32 distance computation for better precision at the cost of performance and memory usage.

  • FP16: Use fp16 distance computation.

Values:

enumerator AUTO#
enumerator FP32#
enumerator FP16#
struct index_params : public cuvs::neighbors::index_params#
#include <nn_descent.hpp>

Parameters used to build an nn-descent index.

  • graph_degree: For an input dataset of dimensions (N, D), determines the final dimensions of the all-neighbors knn graph which turns out to be of dimensions (N, graph_degree)

  • intermediate_graph_degree: Internally, nn-descent builds an all-neighbors knn graph of dimensions (N, intermediate_graph_degree) before selecting the final graph_degree neighbors. It’s recommended that intermediate_graph_degree >= 1.5 * graph_degree

  • max_iterations: The number of iterations that nn-descent will refine the graph for. More iterations produce a better quality graph at cost of performance

  • termination_threshold: The delta at which nn-descent will terminate its iterations

  • return_distances: Boolean to decide whether to return distances array

  • dist_comp_dtype: dtype to use for distance computation. Defaults to AUTO which automatically determines the best dtype for distance computation based on the dataset dimensions. Use FP32 for better precision at the cost of performance and memory usage. This option is only valid when data type is fp32. Use FP16 for better performance and memory usage at the cost of precision.

Public Functions

index_params(
size_t graph_degree = 64,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
)#

Construct NN descent parameters for a specific kNN graph degree.

Parameters:
  • graph_degree – output graph degree

  • metric – distance metric to use

Index#

template<typename IdxT>
struct index : public cuvs::neighbors::index#
#include <nn_descent.hpp>

nn-descent Build an nn-descent index The index contains an all-neighbors graph of the input dataset stored in host memory of dimensions (n_rows, n_cols)

Reference: Hui Wang, Wan-Lei Zhao, Xiangxiang Zeng, and Jianye Yang. 2021. Fast k-NN Graph Construction by GPU based NN-Descent. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM ‘21). Association for Computing Machinery, New York, NY, USA, 1929–1938. https://doi.org/10.1145/3459637.3482344

Template Parameters:

IdxT – dtype to be used for constructing knn-graph

Public Functions

inline index(
raft::resources const &res,
int64_t n_rows,
int64_t n_cols,
bool return_distances = false,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
)#

Construct a new index object.

This constructor creates an nn-descent index which is a knn-graph in host memory. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).

Parameters:
  • res – raft::resources is an object mangaging resources

  • n_rows – number of rows in knn-graph

  • n_cols – number of cols in knn-graph

  • return_distances – whether to return distances

  • metric – distance metric to use

inline index(
raft::resources const &res,
raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph_view,
std::optional<raft::device_matrix_view<float, int64_t, row_major>> distances_view = std::nullopt,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
)#

Construct a new index object.

This constructor creates an nn-descent index using a user allocated host memory knn-graph. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).

Parameters:
  • res – raft::resources is an object mangaging resources

  • graph_view – raft::host_matrix_view<IdxT, int64_t, raft::row_major> for storing knn-graph

  • distances_view – optional raft::device_matrix_view<float, int64_t, row_major> for storing distances

  • metric – distance metric to use

inline cuvs::distance::DistanceType metric() const noexcept#

Distance metric used for clustering.

inline IdxT size() const noexcept#

Total length of the index (number of vectors).

inline uint32_t graph_degree() const noexcept#

Graph degree

inline raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph(
) noexcept#

neighborhood graph [size, graph-degree]

inline std::optional<device_matrix_view<float, int64_t, row_major>> distances(
) noexcept#

neighborhood graph distances [size, graph-degree]

Index build#

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::device_matrix_view<const float, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::host_matrix_view<const float, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::device_matrix_view<const half, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::host_matrix_view<const half, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

  • BitwiseHamming

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

  • BitwiseHamming

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in device memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

  • BitwiseHamming

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::device_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Parameters:
  • res[in] raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::device_matrix_view input dataset expected to be located in device memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory

cuvs::neighbors::nn_descent::index<uint32_t> build(
raft::resources const &res,
index_params const &params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
)#

Build nn-descent Index with dataset in host memory.

The following distance metrics are supported:

  • L2Expanded

  • L2SqrtExpanded

  • CosineExpanded

  • InnerProduct

  • BitwiseHamming

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
nn_descent::index_params index_params;
// create and fill the index from a [N, D] raft::host_matrix_view dataset
auto index = nn_descent::build(res, index_params, dataset);
// index.graph() provides a raft::host_matrix_view of an
// all-neighbors knn graph of dimensions [N, k] of the input
// dataset

Template Parameters:
  • T – data-type of the input dataset

  • IdxT – data-type for the output index

Parameters:
  • res – raft::resources is an object mangaging resources

  • params[in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm

  • dataset[in] raft::host_matrix_view input dataset expected to be located in host memory

  • graph[in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph

Returns:

index<IdxT> index containing all-neighbors knn graph in host memory