NN-Descent#
The NN-descent method is an ANN algorithm that directly approximates a k-nearest neighbors graph by randomly sampling points to compute distances and using neighbors of neighbors distances to reduce distance computations.
#include <cuvs/neighbors/nn_descent.hpp>
namespace cuvs::neighbors::nn_descent
Index build parameters#
-
enum class DIST_COMP_DTYPE#
Dtype to use for distance computation.
AUTO: Automatically determine the best dtype for distance computation based on the dataset dimensions.FP32: Use fp32 distance computation for better precision at the cost of performance and memory usage.FP16: Use fp16 distance computation.
Values:
-
enumerator AUTO#
-
enumerator FP32#
-
enumerator FP16#
-
struct index_params : public cuvs::neighbors::index_params#
- #include <nn_descent.hpp>
Parameters used to build an nn-descent index.
graph_degree: For an input dataset of dimensions (N, D), determines the final dimensions of the all-neighbors knn graph which turns out to be of dimensions (N, graph_degree)intermediate_graph_degree: Internally, nn-descent builds an all-neighbors knn graph of dimensions (N, intermediate_graph_degree) before selecting the finalgraph_degreeneighbors. It’s recommended thatintermediate_graph_degree>= 1.5 * graph_degreemax_iterations: The number of iterations that nn-descent will refine the graph for. More iterations produce a better quality graph at cost of performancetermination_threshold: The delta at which nn-descent will terminate its iterationsreturn_distances: Boolean to decide whether to return distances arraydist_comp_dtype: dtype to use for distance computation. Defaults toAUTOwhich automatically determines the best dtype for distance computation based on the dataset dimensions. UseFP32for better precision at the cost of performance and memory usage. This option is only valid when data type is fp32. UseFP16for better performance and memory usage at the cost of precision.
Public Functions
- index_params(
- size_t graph_degree = 64,
- cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
Construct NN descent parameters for a specific kNN graph degree.
- Parameters:
graph_degree – output graph degree
metric – distance metric to use
Index#
-
template<typename IdxT>
struct index : public cuvs::neighbors::index# - #include <nn_descent.hpp>
nn-descent Build an nn-descent index The index contains an all-neighbors graph of the input dataset stored in host memory of dimensions (n_rows, n_cols)
Reference: Hui Wang, Wan-Lei Zhao, Xiangxiang Zeng, and Jianye Yang. 2021. Fast k-NN Graph Construction by GPU based NN-Descent. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management (CIKM ‘21). Association for Computing Machinery, New York, NY, USA, 1929–1938. https://doi.org/10.1145/3459637.3482344
- Template Parameters:
IdxT – dtype to be used for constructing knn-graph
Public Functions
- inline index(
- raft::resources const &res,
- int64_t n_rows,
- int64_t n_cols,
- bool return_distances = false,
- cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
Construct a new index object.
This constructor creates an nn-descent index which is a knn-graph in host memory. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).
- Parameters:
res – raft::resources is an object mangaging resources
n_rows – number of rows in knn-graph
n_cols – number of cols in knn-graph
return_distances – whether to return distances
metric – distance metric to use
- inline index(
- raft::resources const &res,
- raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph_view,
- std::optional<raft::device_matrix_view<float, int64_t, row_major>> distances_view = std::nullopt,
- cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
Construct a new index object.
This constructor creates an nn-descent index using a user allocated host memory knn-graph. The type of the knn-graph is a dense raft::host_matrix and dimensions are (n_rows, n_cols).
- Parameters:
res – raft::resources is an object mangaging resources
graph_view – raft::host_matrix_view<IdxT, int64_t, raft::row_major> for storing knn-graph
distances_view – optional raft::device_matrix_view<float, int64_t, row_major> for storing distances
metric – distance metric to use
-
inline cuvs::distance::DistanceType metric() const noexcept#
Distance metric used for clustering.
-
inline uint32_t graph_degree() const noexcept#
Graph degree
- inline raft::host_matrix_view<IdxT, int64_t, raft::row_major> graph(
neighborhood graph [size, graph-degree]
- inline std::optional<device_matrix_view<float, int64_t, row_major>> distances(
neighborhood graph distances [size, graph-degree]
Index build#
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::device_matrix_view<const float, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in device memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::device_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Parameters:
res – [in] raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::device_matrix_view input dataset expected to be located in device memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::host_matrix_view<const float, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in host memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::host_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Template Parameters:
T – data-type of the input dataset
IdxT – data-type for the output index
- Parameters:
res – raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::host_matrix_view input dataset expected to be located in host memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::device_matrix_view<const half, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in device memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::device_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Parameters:
res – [in] raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::device_matrix_view input dataset expected to be located in device memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::host_matrix_view<const half, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in host memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::host_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Template Parameters:
T – data-type of the input dataset
IdxT – data-type for the output index
- Parameters:
res – raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::host_matrix_view input dataset expected to be located in host memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in device memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
BitwiseHamming
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::device_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Parameters:
res – [in] raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::device_matrix_view input dataset expected to be located in device memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in host memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
BitwiseHamming
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::host_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Template Parameters:
T – data-type of the input dataset
IdxT – data-type for the output index
- Parameters:
res – raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::host_matrix_view input dataset expected to be located in host memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in device memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
BitwiseHamming
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::device_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Parameters:
res – [in] raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::device_matrix_view input dataset expected to be located in device memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory
- cuvs::neighbors::nn_descent::index<uint32_t> build(
- raft::resources const &res,
- index_params const ¶ms,
- raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
- std::optional<raft::host_matrix_view<uint32_t, int64_t, raft::row_major>> graph = std::nullopt
Build nn-descent Index with dataset in host memory.
The following distance metrics are supported:
L2Expanded
L2SqrtExpanded
CosineExpanded
InnerProduct
BitwiseHamming
Usage example:
using namespace cuvs::neighbors; // use default index parameters nn_descent::index_params index_params; // create and fill the index from a [N, D] raft::host_matrix_view dataset auto index = nn_descent::build(res, index_params, dataset); // index.graph() provides a raft::host_matrix_view of an // all-neighbors knn graph of dimensions [N, k] of the input // dataset
- Template Parameters:
T – data-type of the input dataset
IdxT – data-type for the output index
- Parameters:
res – raft::resources is an object mangaging resources
params – [in] an instance of nn_descent::index_params that are parameters to run the nn-descent algorithm
dataset – [in] raft::host_matrix_view input dataset expected to be located in host memory
graph – [in] optional raft::host_matrix_view<uint32_t, int64_t, raft::row_major> for owning the output graph
- Returns:
index<IdxT> index containing all-neighbors knn graph in host memory