Brute Force KNN
Index
- class cuvs.neighbors.brute_force.Index
Brute Force index object. This object stores the trained Brute Force index, which can be used to perform nearest neighbors searches.
- Attributes:
- trained
Index build
- cuvs.neighbors.brute_force.build(dataset, metric='sqeuclidean', metric_arg=2.0, resources=None)
Build the Brute Force index from the dataset for efficient search.
- Parameters:
- dataset : CUDA array interface compliant matrix, shape (n_samples, dim)
Supported dtype [float, int8, uint8]
- metric : Distance metric to use. Default is sqeuclidean
- metric_arg : Value of ‘p’ for Minkowski distances
- resources : Optional cuVS Resource handle for reusing CUDA resources.
If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index: cuvs.neighbors.brute_force.Index
Examples
>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> index = brute_force.build(dataset, metric="cosine")
>>> distances, neighbors = brute_force.search(index, dataset, k)
>>> distances = cp.asarray(distances)
>>> neighbors = cp.asarray(neighbors)
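To make the role of metric and metric_arg concrete, here is a minimal CPU sketch of what an exhaustive nearest neighbors search computes. It is not the cuVS implementation, which runs on the GPU; the function names brute_force_knn, sqeuclidean and minkowski are hypothetical and exist only for this illustration. The sketch assumes the default sqeuclidean metric means squared Euclidean distance, and that metric_arg plays the part of p in the Minkowski distance.

```python
import heapq


def minkowski(a, b, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def sqeuclidean(a, b):
    """Squared Euclidean distance (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(dataset, queries, k, distance=sqeuclidean):
    """Return (distances, neighbors): for each query, the k smallest
    distances and the matching dataset row indices, nearest first."""
    all_distances, all_neighbors = [], []
    for q in queries:
        # Score every dataset row; this exhaustive scan is the "brute force".
        scored = ((distance(q, row), i) for i, row in enumerate(dataset))
        best = heapq.nsmallest(k, scored)
        all_distances.append([d for d, _ in best])
        all_neighbors.append([i for _, i in best])
    return all_distances, all_neighbors


dataset = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
queries = [[0.9, 0.1]]
distances, neighbors = brute_force_knn(dataset, queries, k=2)
print(neighbors)
```

Passing distance=lambda a, b: minkowski(a, b, p=1.0) to the hypothetical brute_force_knn would rank rows by Manhattan distance instead, which is the kind of change metric_arg controls for Minkowski metrics.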
Index search
Index save
- cuvs.neighbors.brute_force.save(filename, Index index, bool include_dataset=True, resources=None)
Saves the index to a file.
The serialization format is subject to change, so loading an index saved with a previous version of cuVS is not guaranteed to work.
- Parameters:
- filename : string
Name of the file.
- index : Index
Trained Brute Force index.
- resources : Optional cuVS Resource handle for reusing CUDA resources.
If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
Examples
>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build index
>>> index = brute_force.build(dataset)
>>> # Serialize and deserialize the brute_force index built
>>> brute_force.save("my_index.bin", index)
>>> index_loaded = brute_force.load("my_index.bin")
Index load
- cuvs.neighbors.brute_force.load(filename, resources=None)
Loads an index from a file.
The serialization format is subject to change, so loading an index saved with a previous version of cuVS is not guaranteed to work.
- Parameters:
- filename : string
Name of the file.
- resources : Optional cuVS Resource handle for reusing CUDA resources.
If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index : Index
Examples
>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build index
>>> index = brute_force.build(dataset)
>>> # Serialize and deserialize the brute_force index built
>>> brute_force.save("my_index.bin", index)
>>> index_loaded = brute_force.load("my_index.bin")
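Because an index file written by one version may not load in another, one possible safeguard is to record the library version next to the saved file and check it before loading. The sketch below is an assumption about how a caller might do this, not a feature of cuVS: the helpers write_version_sidecar and saved_with_same_version, the ".meta.json" suffix, and the version strings are all hypothetical, and the caller is expected to supply the actual installed version string.

```python
import json
import os
import tempfile


def write_version_sidecar(index_path, library_version):
    """Record which library version produced the index file."""
    sidecar = index_path + ".meta.json"
    with open(sidecar, "w") as f:
        json.dump({"library_version": library_version}, f)
    return sidecar


def saved_with_same_version(index_path, library_version):
    """True only if a sidecar exists and its version matches exactly."""
    sidecar = index_path + ".meta.json"
    if not os.path.exists(sidecar):
        return False
    with open(sidecar) as f:
        return json.load(f).get("library_version") == library_version


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my_index.bin")
    # In real use, call this right after brute_force.save(path, index).
    write_version_sidecar(path, "1.0.0")  # placeholder version string
    ok_same = saved_with_same_version(path, "1.0.0")
    ok_other = saved_with_same_version(path, "2.0.0")
```

When the check fails, rebuilding the index from the original dataset with brute_force.build is the safer path than attempting brute_force.load on the stale file.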