Multi-GPU IVF-Flat
Multi-GPU IVF-Flat extends the IVF-Flat algorithm to run across multiple GPUs, improving scalability and performance for large-scale vector search. It supports two distribution modes: replicated, where each GPU holds a full copy of the index, and sharded, where the index is partitioned across the GPUs.
Note
IMPORTANT: Multi-GPU IVF-Flat requires all data (datasets, queries, and output arrays) to be in host (CPU) memory. If you are using CuPy or other device arrays, transfer them to host with array.get() or cp.asnumpy(array) before use.
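The host-memory requirement above can be handled with a small conversion step before any data reaches the multi-GPU API. The sketch below is illustrative only: to_host is a hypothetical helper name, not part of the library, and it assumes device arrays expose a CuPy-style get() method. It runs with NumPy alone, so host arrays simply pass through.

```python
import numpy as np


def to_host(array):
    """Return `array` as a NumPy array in host (CPU) memory.

    Hypothetical helper for illustration; not part of any library API.
    """
    # Assumption: device arrays follow CuPy's convention, where
    # ndarray.get() copies device memory into a new NumPy array.
    get = getattr(array, "get", None)
    if callable(get):
        return get()
    # Anything else (NumPy array, nested list, ...) is already on the host.
    return np.asarray(array)


# Example data; the shapes and dtype here are arbitrary.
dataset = to_host(np.random.rand(1000, 64).astype(np.float32))
queries = to_host(np.random.rand(10, 64).astype(np.float32))
```

With CuPy installed, to_host(cp.asarray(x)) would take the get() branch, which matches the array.get() route mentioned in the note. cp.asnumpy(array) is the equivalent function-style call.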