Brute Force KNN

Index

class cuvs.neighbors.brute_force.Index

Brute Force index object. This object stores the trained Brute Force index, which can be used to perform nearest neighbor searches.

Attributes:
trained

Index build

cuvs.neighbors.brute_force.build(dataset, metric='sqeuclidean', metric_arg=2.0, resources=None)

Build the Brute Force index from the dataset for efficient search.

Parameters:
dataset: CUDA array interface compliant matrix, shape (n_samples, dim)

Supported dtype [float, int8, uint8]

metric: Distance metric to use. Default is sqeuclidean
metric_arg: Value of ‘p’ for Minkowski distances
resources: Optional cuVS Resource handle for reusing CUDA resources.

If resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you must synchronize explicitly by calling resources.sync() before accessing the output.
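To make the role of metric_arg concrete, here is a minimal NumPy sketch of the standard Minkowski distance of order p. It illustrates only the mathematical definition; minkowski_distance is a hypothetical helper, not a cuVS function, and this section does not list the metric names that build accepts.

```python
import numpy as np


def minkowski_distance(x, y, p=2.0):
    """Minkowski distance of order p between two 1-D vectors.

    Hypothetical helper for illustration only; not part of cuVS.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Sum of |x_i - y_i|^p, then the p-th root
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))


a = [0.0, 0.0]
b = [3.0, 4.0]
d_manhattan = minkowski_distance(a, b, p=1.0)  # p = 1: sum of absolute differences
d_euclidean = minkowski_distance(a, b, p=2.0)  # p = 2: Euclidean distance
```

With p = 1 the distance reduces to the Manhattan distance, and with p = 2 (the default value of metric_arg) to the Euclidean distance.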

Returns:
index: cuvs.neighbors.brute_force.Index

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
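>>> # Build the index with the cosine metric
>>> index = brute_force.build(dataset, metric="cosine")
>>> # Search with a separate batch of n_queries query vectors
>>> queries = cp.random.random_sample((n_queries, n_features),
...                                   dtype=cp.float32)
>>> distances, neighbors = brute_force.search(index, queries, k)
>>> # Convert the outputs to CuPy arrays
>>> distances = cp.asarray(distances)
>>> neighbors = cp.asarray(neighbors)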
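
For small inputs, search results can be sanity-checked against a CPU reference. The following is a minimal NumPy sketch of exact k-nearest-neighbor search under squared Euclidean distance (the default sqeuclidean metric). The helper brute_force_knn_reference is hypothetical and not part of cuVS, and it makes no attempt to reproduce cuVS's tie-breaking or floating-point behavior.

```python
import numpy as np


def brute_force_knn_reference(dataset, queries, k):
    """Exact k nearest neighbors under squared Euclidean distance (CPU).

    Hypothetical reference helper for illustration only; not part of cuVS.
    Returns (distances, neighbors), each of shape (n_queries, k).
    """
    dataset = np.asarray(dataset, dtype=np.float64)
    queries = np.asarray(queries, dtype=np.float64)
    # Pairwise differences, shape (n_queries, n_samples, dim)
    diff = queries[:, None, :] - dataset[None, :, :]
    # Squared Euclidean distances, shape (n_queries, n_samples)
    sq_dists = np.einsum("qnd,qnd->qn", diff, diff)
    # Indices of the k closest dataset rows per query, nearest first
    neighbors = np.argsort(sq_dists, axis=1, kind="stable")[:, :k]
    distances = np.take_along_axis(sq_dists, neighbors, axis=1)
    return distances, neighbors


rng = np.random.default_rng(0)
small_dataset = rng.random((200, 8), dtype=np.float32)
# Use the first five dataset rows as queries
ref_distances, ref_neighbors = brute_force_knn_reference(
    small_dataset, small_dataset[:5], k=3
)
```

This materializes the full (n_queries, n_samples, dim) difference tensor, so it is only practical for small inputs such as unit tests.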

Index save

cuvs.neighbors.brute_force.save(filename, Index index, bool include_dataset=True, resources=None)

Saves the index to a file.

The serialization format is subject to change, so loading an index saved with a previous version of cuVS is not guaranteed to work.

Parameters:
filename: string

Name of the file.

index: Index

Trained Brute Force index.

resources: Optional cuVS Resource handle for reusing CUDA resources.

If resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you must synchronize explicitly by calling resources.sync() before accessing the output.

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build index
>>> index = brute_force.build(dataset)
>>> # Serialize and deserialize the brute_force index built
>>> brute_force.save("my_index.bin", index)
>>> index_loaded = brute_force.load("my_index.bin")

Index load

cuvs.neighbors.brute_force.load(filename, resources=None)

Loads an index from a file.

The serialization format is subject to change, so loading an index saved with a previous version of cuVS is not guaranteed to work.

Parameters:
filename: string

Name of the file.

resources: Optional cuVS Resource handle for reusing CUDA resources.

If resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you must synchronize explicitly by calling resources.sync() before accessing the output.

Returns:
index: Index

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import brute_force
>>> n_samples = 50000
>>> n_features = 50
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build index
>>> index = brute_force.build(dataset)
>>> # Serialize and deserialize the brute_force index built
>>> brute_force.save("my_index.bin", index)
>>> index_loaded = brute_force.load("my_index.bin")
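
Because the serialization format may change between cuVS versions, a caller may want to fall back to rebuilding the index when a saved file cannot be loaded. The following is a minimal sketch of that pattern under stated assumptions: load_or_rebuild is a hypothetical helper, the load, build, and save steps are passed in as callables, and catching a broad Exception is an assumption because this section does not specify what load raises on an incompatible file.

```python
import os


def load_or_rebuild(path, load_fn, build_fn, save_fn):
    """Load a saved index from `path`, or rebuild and re-save it.

    Hypothetical helper for illustration only; not part of cuVS.
    load_fn(path) -> index, build_fn() -> index, save_fn(path, index).
    """
    if os.path.exists(path):
        try:
            return load_fn(path)
        except Exception:
            # Assumption: a file written by an incompatible version makes
            # load_fn raise. Fall through and rebuild from scratch.
            pass
    index = build_fn()
    save_fn(path, index)
    return index
```

With cuVS, the callables could be brute_force.load, lambda: brute_force.build(dataset), and brute_force.save, matching the signatures documented above.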