NN-Descent
Index build parameters
- class cuvs.neighbors.nn_descent.IndexParams(metric=None, *, metric_arg=None, graph_degree=None, intermediate_graph_degree=None, max_iterations=None, termination_threshold=None, n_clusters=None)
Parameters to build NN-Descent Index
- Parameters:
- metric : str, default = "sqeuclidean"
String denoting the metric type.
- graph_degree : int
For an input dataset of shape (N, D), the number of neighbors kept per row in the final all-neighbors k-NN graph, which therefore has shape (N, graph_degree).
- intermediate_graph_degree : int
Internally, NN-Descent builds an all-neighbors k-NN graph of shape (N, intermediate_graph_degree) before selecting the final graph_degree neighbors. It is recommended that intermediate_graph_degree >= 1.5 * graph_degree.
- max_iterations : int
The number of iterations for which NN-Descent refines the graph. More iterations produce a higher-quality graph at the cost of a longer build time.
- termination_threshold : float
The delta at which NN-Descent terminates its iterations.
- Attributes:
- graph_degree
- intermediate_graph_degree
- max_iterations
- metric
- metric_arg
- n_clusters
- termination_threshold
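To illustrate how max_iterations and termination_threshold interact, here is a minimal CPU sketch of the general NN-Descent idea in plain Python. It is not the cuVS implementation. The helper names (`try_insert`, `nn_descent_sketch`) are hypothetical, and treating termination_threshold as the fraction of the N * k graph entries updated in one iteration is an assumption borrowed from the usual formulation of the algorithm, not something this page states.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def try_insert(row, d, j):
    """Replace the worst entry of a sorted (distance, index) row if (d, j) is better.

    Returns 1 if the row changed, 0 otherwise.
    """
    if any(existing == j for _, existing in row):
        return 0  # j is already a neighbor
    if d >= row[-1][0]:
        return 0  # not closer than the current worst neighbor
    row[-1] = (d, j)
    row.sort()
    return 1


def nn_descent_sketch(points, k, max_iterations=20,
                      termination_threshold=0.001, seed=0):
    rng = random.Random(seed)
    n = len(points)

    # Start from a random graph with k distinct neighbors per point.
    graph = []
    for i in range(n):
        others = rng.sample([j for j in range(n) if j != i], k)
        graph.append(sorted((sq_dist(points[i], points[j]), j) for j in others))

    iterations_run = 0
    for _ in range(max_iterations):
        iterations_run += 1

        # Gather forward and reverse neighbors of every point.
        candidates = [set() for _ in range(n)]
        for i in range(n):
            for _, j in graph[i]:
                candidates[i].add(j)
                candidates[j].add(i)

        # Local join: neighbors of the same point are likely neighbors of each other.
        updates = 0
        for i in range(n):
            local = sorted(candidates[i])
            for pos, a in enumerate(local):
                for b in local[pos + 1:]:
                    d = sq_dist(points[a], points[b])
                    updates += try_insert(graph[a], d, b)
                    updates += try_insert(graph[b], d, a)

        # Assumed stopping rule: too few updates relative to the graph size.
        if updates <= termination_threshold * n * k:
            break

    return [[j for _, j in row] for row in graph], iterations_run


rng = random.Random(42)
points = [[rng.random() for _ in range(3)] for _ in range(40)]
graph, iterations_run = nn_descent_sketch(points, k=5)
```

In this sketch a larger max_iterations only matters when the update count has not yet dropped below the threshold, which is why the two parameters are usually tuned together.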
Index
- class cuvs.neighbors.nn_descent.Index
NN-Descent index object. This object stores the trained NN-Descent index, which can be used to get the NN-Descent graph and distances after building.
- Attributes:
- graph
- trained
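To make the shape of the graph concrete, the following brute-force sketch in plain Python builds an (N, intermediate_graph_degree) neighbor table and then keeps the closest graph_degree entries per row. It only illustrates the shapes and the 1.5x recommendation. cuVS computes the graph approximately on the GPU, and the helper names here, as well as the choice to exclude each point from its own neighbor list, are assumptions made for the illustration.

```python
import math
import random


def brute_force_knn_graph(points, degree):
    """Return an N x degree table of neighbor indices (squared Euclidean)."""
    graph = []
    for i, p in enumerate(points):
        dists = [
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i  # assumption: a point is not its own neighbor
        ]
        dists.sort()
        graph.append([j for _, j in dists[:degree]])
    return graph


def suggested_intermediate_degree(graph_degree):
    """Smallest integer satisfying intermediate_graph_degree >= 1.5 * graph_degree."""
    return math.ceil(1.5 * graph_degree)


random.seed(0)
points = [[random.random() for _ in range(4)] for _ in range(32)]  # N=32, D=4

graph_degree = 8
intermediate_graph_degree = suggested_intermediate_degree(graph_degree)

# Shape (N, intermediate_graph_degree), rows sorted by increasing distance.
intermediate = brute_force_knn_graph(points, intermediate_graph_degree)
# Shape (N, graph_degree): keep only the closest graph_degree neighbors.
final = [row[:graph_degree] for row in intermediate]
```

Row i of `final` lists the indices of the graph_degree nearest points to point i, which is the layout described for the (N, graph_degree) graph above.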
Index build
- cuvs.neighbors.nn_descent.build(IndexParams index_params, dataset, graph=None, resources=None)
Build KNN graph from the dataset
- Parameters:
- index_params : cuvs.neighbors.nn_descent.IndexParams
- dataset : Array interface compliant matrix, on either host or device memory
Supported dtype [float, int8, uint8]
- graph : Optional host matrix for storing output graph
- resources : Optional cuVS Resource handle for reusing CUDA resources.
If resources aren't supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index : cuvs.neighbors.nn_descent.Index
Examples
>>> import cupy as cp
>>> from cuvs.neighbors import nn_descent
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> build_params = nn_descent.IndexParams(metric="sqeuclidean")
>>> index = nn_descent.build(build_params, dataset)
>>> graph = index.graph
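When a resources handle is passed to build, the caller is responsible for synchronizing before reading the output. The sketch below shows one way this might look. It assumes that the handle type is `cuvs.common.Resources`, which this page does not name, and the function name `build_with_explicit_sync` and the factor of 2 for intermediate_graph_degree are illustrative choices. The imports are placed inside the function so the definition can be loaded on a machine without a GPU.

```python
def build_with_explicit_sync(dataset, graph_degree=32):
    """Build an NN-Descent graph while reusing one cuVS resources handle."""
    # Assumption: Resources lives in cuvs.common.
    from cuvs.common import Resources
    from cuvs.neighbors import nn_descent

    resources = Resources()
    params = nn_descent.IndexParams(
        metric="sqeuclidean",
        graph_degree=graph_degree,
        # Satisfies the recommendation intermediate_graph_degree >= 1.5 * graph_degree.
        intermediate_graph_degree=2 * graph_degree,
    )
    index = nn_descent.build(params, dataset, resources=resources)

    # Resources were supplied, so synchronize explicitly before using the output.
    resources.sync()
    return index.graph
```

Calling `build_with_explicit_sync(dataset)` with a dataset like the one in the example above should return the (N, graph_degree) neighbor graph once the GPU work has completed.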