Multi-GPU IVF-Flat#
Multi-GPU IVF-Flat extends the IVF-Flat algorithm to work across multiple GPUs, providing improved scalability and performance for large-scale vector search. It supports both replicated and sharded distribution modes.
Note
IMPORTANT: Multi-GPU IVF-Flat requires all data (datasets, queries, output arrays) to be in host memory (CPU). If using CuPy/device arrays, transfer them to host with array.get() or cp.asnumpy(array) before use.
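The host-memory requirement can be made explicit with a small conversion helper. The sketch below is an illustration only and not part of cuVS: `to_host` is a hypothetical name, and it assumes that device arrays expose `__cuda_array_interface__` together with a `.get()` method, as CuPy arrays do.

```python
def to_host(array):
    """Return a host-side copy of `array` if it lives on a GPU, else `array` itself.

    Hypothetical helper, not part of cuVS. It relies on duck typing:
    CuPy arrays expose `__cuda_array_interface__` and a `.get()` method
    that copies the data into a NumPy array.
    """
    if hasattr(array, "__cuda_array_interface__"):
        # Device array (assumed CuPy-like): copy it to host memory.
        return array.get()
    # Already on the host (for example a NumPy array): nothing to do.
    return array
```

Calling `dataset = to_host(dataset)` before `ivf_flat.build(...)` leaves NumPy inputs untouched and copies CuPy inputs to the host. Other device-array libraries may use a different copy method, so the check would need adapting for them.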
Index build parameters#
- class cuvs.neighbors.mg.ivf_flat.IndexParams(distribution_mode='sharded', *, **kwargs)#
Parameters to build multi-GPU IVF-Flat index for efficient search. Extends single-GPU IndexParams with multi-GPU specific parameters.
- Parameters:
- distribution_mode : str, default = “sharded”
Distribution mode for multi-GPU setup. Valid values: [“replicated”, “sharded”]
- **kwargs : Additional parameters passed to single-GPU IndexParams
- Attributes:
- distribution_mode
Methods
get_handle(self)
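The two distribution modes differ in which dataset rows each GPU holds. The following sketch illustrates the idea in plain Python: in replicated mode every GPU holds all rows, while in sharded mode the rows are partitioned. `assign_rows` is a hypothetical helper, and the contiguous, near-equal split used for sharding is an assumption made for illustration; the actual partitioning inside cuVS may differ.

```python
def assign_rows(n_rows, n_gpus, distribution_mode="sharded"):
    """Return, for each GPU, the range of dataset rows it would hold.

    Conceptual illustration only (not the cuVS implementation).
    """
    if distribution_mode == "replicated":
        # Every GPU holds a full copy of the dataset.
        return [range(n_rows) for _ in range(n_gpus)]
    if distribution_mode == "sharded":
        # ASSUMPTION: contiguous shards whose sizes differ by at most one row.
        base, extra = divmod(n_rows, n_gpus)
        ranges, start = [], 0
        for gpu in range(n_gpus):
            size = base + (1 if gpu < extra else 0)
            ranges.append(range(start, start + size))
            start += size
        return ranges
    raise ValueError(f"unknown distribution_mode: {distribution_mode!r}")
```

Under this sketch, sharding reduces the memory needed per GPU roughly by a factor of the GPU count, while replication keeps a full copy on each GPU so that any GPU can answer any query on its own.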
Index search parameters#
- class cuvs.neighbors.mg.ivf_flat.SearchParams(n_probes=1, *, search_mode='load_balancer', merge_mode='merge_on_root_rank', n_rows_per_batch=1000, **kwargs)#
Parameters to search multi-GPU IVF-Flat index.
- Attributes:
merge_mode
Get the merge mode for multi-GPU search.
n_rows_per_batch
Get the number of rows per batch for multi-GPU search.
search_mode
Get the search mode for multi-GPU search.
Methods
get_handle(self)
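The page names these parameters without describing what they do. One plausible reading, assumed here and not confirmed by the text above, is that n_rows_per_batch bounds how many query rows are processed at a time and that the merge mode governs how per-GPU partial results are combined into a single top-k list. The sketch below shows both ideas in plain Python; `iter_batches` and `merge_top_k` are hypothetical names and do not reflect cuVS internals.

```python
import heapq


def iter_batches(n_queries, n_rows_per_batch=1000):
    """Yield (start, stop) row bounds covering the queries in fixed-size batches.

    ASSUMPTION: n_rows_per_batch is interpreted as a query batch size.
    """
    for start in range(0, n_queries, n_rows_per_batch):
        yield start, min(start + n_rows_per_batch, n_queries)


def merge_top_k(per_shard_results, k):
    """Merge per-shard candidates for one query into a global top-k.

    `per_shard_results` is a list with one entry per shard; each entry is a
    list of (distance, index) pairs. The k pairs with the smallest distances
    are returned in increasing order of distance.
    ASSUMPTION: this mimics a merge performed on a single (root) rank.
    """
    candidates = (pair for shard in per_shard_results for pair in shard)
    return heapq.nsmallest(k, candidates)
```

For example, 2500 queries with the default batch size are covered by the bounds (0, 1000), (1000, 2000), and (2000, 2500).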
Index#
- class cuvs.neighbors.mg.ivf_flat.Index#
Multi-GPU IVF-Flat index object. Stores the trained multi-GPU IVF-Flat index state which can be used to perform nearest neighbors searches across multiple GPUs.
- Attributes:
- trained
Index build#
- cuvs.neighbors.mg.ivf_flat.build(IndexParams index_params, dataset, resources=None)[source]#
Build the multi-GPU IVF-Flat index from the dataset for efficient search.
- Parameters:
- index_params : cuvs.neighbors.mg.ivf_flat.IndexParams
- dataset : Array interface compliant matrix shape (n_samples, dim)
Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, the dataset MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array).
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index : cuvs.neighbors.mg.ivf_flat.Index
Examples
>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> distances, neighbors = ivf_flat.search(
...     ivf_flat.SearchParams(),
...     index, dataset, k)
>>> # Results are already in host memory (NumPy arrays)
Index search#
- cuvs.neighbors.mg.ivf_flat.search(SearchParams search_params, Index index, queries, k, neighbors=None, distances=None, resources=None)#
Search the multi-GPU IVF-Flat index for the k-nearest neighbors of each query.
- Parameters:
- search_params : cuvs.neighbors.mg.ivf_flat.SearchParams
- index : cuvs.neighbors.mg.ivf_flat.Index
- queries : Array interface compliant matrix shape (n_queries, dim)
Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, queries MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array).
- k : int
The number of neighbors to search for each query.
- neighbors : Array interface compliant matrix shape (n_queries, k), optional
If provided, this array will be filled with the indices of the k-nearest neighbors. If not provided, a new host array will be allocated. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat.
- distances : Array interface compliant matrix shape (n_queries, k), optional
If provided, this array will be filled with the distances to the k-nearest neighbors. If not provided, a new host array will be allocated. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat.
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- distances : numpy.ndarray
The distances to the k-nearest neighbors for each query (in host memory).
- neighbors : numpy.ndarray
The indices of the k-nearest neighbors for each query (in host memory).
Examples
>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> queries = np.random.random_sample((n_queries, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> distances, neighbors = ivf_flat.search(
...     ivf_flat.SearchParams(),
...     index, queries, k)
>>> # Results are already in host memory (NumPy arrays)
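To make the layout of the returned arrays concrete, the sketch below computes the same kind of result exactly, using plain Python lists and squared Euclidean distance. `brute_force_knn` is a hypothetical reference helper, not a cuVS function, and is only practical for small inputs. IVF-Flat search is approximate, so its output may differ from this exact reference.

```python
def brute_force_knn(dataset, queries, k):
    """Exact k-nearest neighbors under squared Euclidean distance.

    Returns (distances, neighbors). Each is a list with one row per query,
    and each row has k entries sorted by increasing distance, mirroring the
    (n_queries, k) shape documented for search().
    """
    distances, neighbors = [], []
    for query in queries:
        # Score every dataset row against this query, then keep the k best.
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(query, row)), i)
            for i, row in enumerate(dataset)
        )[:k]
        distances.append([d for d, _ in scored])
        neighbors.append([i for _, i in scored])
    return distances, neighbors
```

Comparing the neighbors returned by an approximate search against such a reference on a small sample is a common way to sanity-check an index configuration.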
Index extend#
- cuvs.neighbors.mg.ivf_flat.extend(Index index, new_vectors, new_indices=None, resources=None)[source]#
Extend the multi-GPU IVF-Flat index with new vectors.
- Parameters:
- index : cuvs.neighbors.mg.ivf_flat.Index
- new_vectors : Array interface compliant matrix shape (n_new_vectors, dim)
Supported dtype [float32, float16, int8, uint8]. IMPORTANT: For multi-GPU IVF-Flat, new_vectors MUST be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array).
- new_indices : Array interface compliant matrix shape (n_new_vectors,), optional
If provided, these indices will be used for the new vectors. If not provided, indices will be automatically assigned. IMPORTANT: Must be in host memory (CPU) for multi-GPU IVF-Flat.
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
Examples
>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> n_new_vectors = 1000
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> new_vectors = np.random.random_sample(
...     (n_new_vectors, n_features)).astype(np.float32)
>>> # One index per new vector, continuing after the existing rows
>>> new_indices = np.arange(
...     n_samples, n_samples + n_new_vectors, dtype=np.int64)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> ivf_flat.extend(index, new_vectors, new_indices)
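When new_indices is supplied explicitly, it needs exactly one entry per new vector. A common choice, assumed here for illustration, is to continue numbering after the rows already in the index. `make_new_indices` is a hypothetical helper shown with plain Python ranges; the same bounds apply to np.arange.

```python
def make_new_indices(n_existing, n_new):
    """Return ids for `n_new` appended vectors, continuing after `n_existing`.

    The stop value is n_existing + n_new. Using n_new alone as the stop
    value gives an empty range whenever n_existing >= n_new, for example
    range(50000, 1000).
    """
    return list(range(n_existing, n_existing + n_new))
```

With 50000 existing rows and 1000 new vectors, this yields the ids 50000 through 50999, which matches the (n_new_vectors,) shape required for new_indices.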
Index save#
- cuvs.neighbors.mg.ivf_flat.save(Index index, filename, resources=None)[source]#
Serialize the multi-GPU IVF-Flat index to a file.
- Parameters:
- index : cuvs.neighbors.mg.ivf_flat.Index
- filename : str
The filename to serialize the index to.
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
Examples
>>> import numpy as np
>>> from cuvs.neighbors.mg import ivf_flat
>>> n_samples = 50000
>>> n_features = 50
>>> # For multi-GPU IVF-Flat, use host (NumPy) arrays
>>> dataset = np.random.random_sample((n_samples, n_features)).astype(
...     np.float32)
>>> build_params = ivf_flat.IndexParams(metric="sqeuclidean")
>>> index = ivf_flat.build(build_params, dataset)
>>> ivf_flat.save(index, "index.bin")
Index load#
- cuvs.neighbors.mg.ivf_flat.load(filename, resources=None)[source]#
Deserialize the multi-GPU IVF-Flat index from a file.
- Parameters:
- filename : str
The filename to deserialize the index from.
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index : Index
The deserialized index.
Examples
>>> from cuvs.neighbors.mg import ivf_flat
>>> index = ivf_flat.load("index.bin")
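Saving and loading are often combined into a simple cache: reuse the serialized index when the file exists, otherwise build it once and save it. The sketch below shows that pattern with the build, save, and load steps passed in as callables so it can run without a GPU. `load_or_build` is a hypothetical helper, not part of cuVS.

```python
import os


def load_or_build(filename, build_fn, save_fn, load_fn):
    """Load a serialized index if `filename` exists, otherwise build and save it.

    Hypothetical caching helper. The callables are expected to follow the
    signatures documented on this page:
      build_fn()                -> index
      save_fn(index, filename)  -> None
      load_fn(filename)         -> index
    """
    if os.path.exists(filename):
        return load_fn(filename)
    index = build_fn()
    save_fn(index, filename)
    return index


# Intended use with the functions documented above (requires cuVS and GPUs):
# index = load_or_build(
#     "index.bin",
#     lambda: ivf_flat.build(build_params, dataset),
#     ivf_flat.save,
#     ivf_flat.load,
# )
```

Passing the callables in keeps the helper independent of any particular index type, at the cost of not checking that a cached file was built with the same parameters or dataset.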
Index distribute#
- cuvs.neighbors.mg.ivf_flat.distribute(filename, resources=None)[source]#
Distribute a single-GPU IVF-Flat index across multiple GPUs from a file.
- Parameters:
- filename : str
The filename to distribute the index from.
- resources : Optional cuVS Multi-GPU Resource handle for reusing CUDA resources.
If Multi-GPU Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.
- Returns:
- index : Index
The distributed index.
Examples
>>> from cuvs.neighbors.mg import ivf_flat
>>> index = ivf_flat.distribute("single_gpu_index.bin")