Multi-GPU IVF-PQ#

Multi-GPU IVF-PQ extends the IVF-PQ (Inverted File with Product Quantization) algorithm to work across multiple GPUs, providing improved scalability and performance for large-scale vector search. It supports both replicated and sharded distribution modes.

Note

IMPORTANT: Multi-GPU IVF-PQ requires all data (datasets, queries, output arrays) to be in host memory (CPU). If using CuPy/device arrays, transfer to host with array.get() or cp.asnumpy(array) before use.

Index build parameters#

Index search parameters#

Index#

Index build#

Index extend#

Index save#

Index load#

Index distribute#