Multi-GPU Nearest Neighbors#

The multi-GPU C API (SNMG: single-node, multi-GPU) provides functions to deploy ANN indexes across the GPUs of one machine for improved performance and scalability.

Common Types and Enums#

Common types and enums used across multi-GPU ANN algorithms.

#include <cuvs/neighbors/mg_common.h>

enum cuvsMultiGpuDistributionMode#

Distribution mode for multi-GPU indexes.

Values:

enumerator CUVS_NEIGHBORS_MG_REPLICATED#

Index is replicated on each device; favors throughput.

enumerator CUVS_NEIGHBORS_MG_SHARDED#

Index is split across several devices; favors scaling to larger datasets.
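
As a rough illustration of the trade-off between the two modes, the sketch below computes how many dataset rows each device would hold. It does not call cuVS: the enum and function names are hypothetical stand-ins, and the even split used for the sharded case is an assumption, not a documented guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two distribution modes; not the cuVS enum. */
typedef enum { SKETCH_REPLICATED, SKETCH_SHARDED } sketch_distribution_mode;

/* Rows of the dataset that device `rank` would hold under each mode.
 * Assumption: the sharded case splits rows evenly and gives the
 * remainder to the lowest ranks. */
int64_t sketch_rows_on_device(sketch_distribution_mode mode,
                              int64_t n_rows, int n_devices, int rank)
{
  assert(n_devices > 0 && rank >= 0 && rank < n_devices);
  if (mode == SKETCH_REPLICATED) { return n_rows; } /* full copy on every device */
  int64_t base = n_rows / n_devices;
  int64_t rem  = n_rows % n_devices;
  return base + (rank < rem ? 1 : 0);
}
```

Under these assumptions a replicated index costs the full memory footprint on every device, while a sharded index divides it, which is why the former suits throughput and the latter suits datasets that do not fit on one GPU.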

enum cuvsMultiGpuReplicatedSearchMode#

Search mode when using a replicated index.

Values:

enumerator CUVS_NEIGHBORS_MG_LOAD_BALANCER#

Search queries are split to maintain equal load on GPUs

enumerator CUVS_NEIGHBORS_MG_ROUND_ROBIN#

Each search query is processed by a single GPU in a round-robin fashion
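
The difference between the two replicated search modes can be pictured with a small scheduling sketch. It is plain C with hypothetical helper names and does not reflect the library's internal implementation; it also assumes that "search query" in the round-robin description refers to one search call.

```c
#include <assert.h>
#include <stdint.h>

/* Load-balancer sketch: one batch of n_queries is cut into contiguous
 * slices, one per GPU. Writes the slice owned by `gpu` to offset/count. */
void sketch_load_balanced_slice(int64_t n_queries, int n_gpus, int gpu,
                                int64_t *offset, int64_t *count)
{
  assert(n_gpus > 0 && gpu >= 0 && gpu < n_gpus);
  int64_t base = n_queries / n_gpus;
  int64_t rem  = n_queries % n_gpus;
  *count  = base + (gpu < rem ? 1 : 0);
  *offset = gpu * base + (gpu < rem ? gpu : rem);
}

/* Round-robin sketch: each search call is sent whole to a single GPU,
 * and a caller-owned counter rotates through the GPUs. */
int sketch_round_robin_next(uint64_t *call_counter, int n_gpus)
{
  assert(n_gpus > 0);
  int gpu = (int)(*call_counter % (uint64_t)n_gpus);
  *call_counter += 1;
  return gpu;
}
```

In this picture the load balancer helps when a single call carries many queries, whereas round-robin helps when many small calls arrive concurrently.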

enum cuvsMultiGpuShardedMergeMode#

Merge mode when using a sharded index.

Values:

enumerator CUVS_NEIGHBORS_MG_MERGE_ON_ROOT_RANK#

Search batches are merged on the root rank

enumerator CUVS_NEIGHBORS_MG_TREE_MERGE#

Search batches are merged in a tree reduction fashion
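
To make the two merge strategies concrete, here is a minimal host-side sketch that merges per-shard top-k distance lists. All names are hypothetical, only distances are tracked (a real search also carries neighbor indices), and the actual library performs this work across devices rather than in a host loop.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_K 3 /* neighbors kept per query; illustrative value */

/* Merge two ascending lists of length SKETCH_K, keeping the K smallest in `a`. */
void sketch_merge_topk(float *a, const float *b)
{
  float out[SKETCH_K];
  int i = 0, j = 0;
  for (int n = 0; n < SKETCH_K; ++n) {
    out[n] = (a[i] <= b[j]) ? a[i++] : b[j++];
  }
  memcpy(a, out, sizeof(out));
}

/* Merge on root: rank 0 folds every other shard's list into its own, one by one. */
void sketch_merge_on_root(float lists[][SKETCH_K], int n_shards)
{
  assert(n_shards > 0);
  for (int r = 1; r < n_shards; ++r) { sketch_merge_topk(lists[0], lists[r]); }
}

/* Tree merge: shards merge pairwise at doubling strides, so the final
 * result reaches rank 0 after about log2(n_shards) rounds. */
void sketch_tree_merge(float lists[][SKETCH_K], int n_shards)
{
  assert(n_shards > 0);
  for (int stride = 1; stride < n_shards; stride *= 2) {
    for (int r = 0; r + stride < n_shards; r += 2 * stride) {
      sketch_merge_topk(lists[r], lists[r + stride]);
    }
  }
}
```

Both functions leave the same top-k list in `lists[0]`; they differ in how the merge work is spread. The root-rank variant does every merge on one rank, while the tree variant lets merges within a round proceed independently.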

Multi-GPU IVF-Flat#

The Multi-GPU IVF-Flat method extends the IVF-Flat ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).

#include <cuvs/neighbors/mg_ivf_flat.h>

IVF-Flat Index Build Parameters#

typedef struct cuvsMultiGpuIvfFlatIndexParams *cuvsMultiGpuIvfFlatIndexParams_t#
cuvsError_t cuvsMultiGpuIvfFlatIndexParamsCreate(
cuvsMultiGpuIvfFlatIndexParams_t *index_params
)#

Allocate Multi-GPU IVF-Flat Index params, and populate with default values.

Parameters:

index_params[out] pointer to the cuvsMultiGpuIvfFlatIndexParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfFlatIndexParamsDestroy(
cuvsMultiGpuIvfFlatIndexParams_t index_params
)#

De-allocate Multi-GPU IVF-Flat Index params.

Parameters:

index_params[in] cuvsMultiGpuIvfFlatIndexParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfFlatIndexParams#
#include <cuvs/neighbors/mg_ivf_flat.h>

Multi-GPU parameters to build IVF-Flat Index.

This structure extends the base IVF-Flat index parameters with multi-GPU specific settings.

Public Members

cuvsIvfFlatIndexParams_t base_params#

Base IVF-Flat index parameters

cuvsMultiGpuDistributionMode mode#

Distribution mode for multi-GPU setup

IVF-Flat Index Search Parameters#

typedef struct cuvsMultiGpuIvfFlatSearchParams *cuvsMultiGpuIvfFlatSearchParams_t#
cuvsError_t cuvsMultiGpuIvfFlatSearchParamsCreate(
cuvsMultiGpuIvfFlatSearchParams_t *params
)#

Allocate Multi-GPU IVF-Flat search params, and populate with default values.

Parameters:

params[out] pointer to the cuvsMultiGpuIvfFlatSearchParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfFlatSearchParamsDestroy(
cuvsMultiGpuIvfFlatSearchParams_t params
)#

De-allocate Multi-GPU IVF-Flat search params.

Parameters:

params[in] cuvsMultiGpuIvfFlatSearchParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfFlatSearchParams#
#include <cuvs/neighbors/mg_ivf_flat.h>

Multi-GPU parameters to search IVF-Flat index.

This structure extends the base IVF-Flat search parameters with multi-GPU specific settings.

Public Members

cuvsIvfFlatSearchParams_t base_params#

Base IVF-Flat search parameters

cuvsMultiGpuReplicatedSearchMode search_mode#

Replicated search mode

cuvsMultiGpuShardedMergeMode merge_mode#

Sharded merge mode

int64_t n_rows_per_batch#

Number of rows per batch
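
The arithmetic implied by a per-batch row limit can be sketched as follows. This assumes that n_rows_per_batch caps how many query rows are processed at a time, which the description above does not spell out; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of batches needed to cover n_queries rows (ceiling division). */
int64_t sketch_n_batches(int64_t n_queries, int64_t n_rows_per_batch)
{
  assert(n_queries >= 0 && n_rows_per_batch > 0);
  return (n_queries + n_rows_per_batch - 1) / n_rows_per_batch;
}

/* Rows in batch `b`; only the last batch can be shorter than the limit. */
int64_t sketch_batch_rows(int64_t n_queries, int64_t n_rows_per_batch, int64_t b)
{
  assert(b >= 0 && b < sketch_n_batches(n_queries, n_rows_per_batch));
  int64_t start = b * n_rows_per_batch;
  int64_t left  = n_queries - start;
  return left < n_rows_per_batch ? left : n_rows_per_batch;
}
```

For example, 2500 queries with a limit of 1000 rows would be covered by three batches of 1000, 1000, and 500 rows.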

IVF-Flat Index#

typedef cuvsMultiGpuIvfFlatIndex *cuvsMultiGpuIvfFlatIndex_t#
cuvsError_t cuvsMultiGpuIvfFlatIndexCreate(
cuvsMultiGpuIvfFlatIndex_t *index
)#

Allocate Multi-GPU IVF-Flat index.

Parameters:

index[out] pointer to the cuvsMultiGpuIvfFlatIndex_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfFlatIndexDestroy(
cuvsMultiGpuIvfFlatIndex_t index
)#

De-allocate Multi-GPU IVF-Flat index.

Parameters:

index[in] cuvsMultiGpuIvfFlatIndex_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfFlatIndex#
#include <cuvs/neighbors/mg_ivf_flat.h>

Struct to hold address of cuvs::neighbors::mg_index<ivf_flat::index> and its active trained dtype.

IVF-Flat Index Build#

cuvsError_t cuvsMultiGpuIvfFlatBuild(
cuvsResources_t res,
cuvsMultiGpuIvfFlatIndexParams_t params,
DLManagedTensor *dataset_tensor,
cuvsMultiGpuIvfFlatIndex_t index
)#

Build a Multi-GPU IVF-Flat index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • params[in] Multi-GPU IVF-Flat index parameters

  • dataset_tensor[in] DLManagedTensor* training dataset

  • index[out] Multi-GPU IVF-Flat index

Returns:

cuvsError_t

IVF-Flat Index Extend#

cuvsError_t cuvsMultiGpuIvfFlatExtend(
cuvsResources_t res,
cuvsMultiGpuIvfFlatIndex_t index,
DLManagedTensor *new_vectors_tensor,
DLManagedTensor *new_indices_tensor
)#

Extend a Multi-GPU IVF-Flat index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[inout] Multi-GPU IVF-Flat index to extend

  • new_vectors_tensor[in] DLManagedTensor* new vectors to add

  • new_indices_tensor[in] DLManagedTensor* new indices (optional, can be NULL)

Returns:

cuvsError_t

IVF-Flat Index Serialize#

cuvsError_t cuvsMultiGpuIvfFlatSerialize(
cuvsResources_t res,
cuvsMultiGpuIvfFlatIndex_t index,
const char *filename
)#

Serialize a Multi-GPU IVF-Flat index to file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[in] Multi-GPU IVF-Flat index to serialize

  • filename[in] Path to the output file

Returns:

cuvsError_t

IVF-Flat Index Deserialize#

cuvsError_t cuvsMultiGpuIvfFlatDeserialize(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuIvfFlatIndex_t index
)#

Deserialize a Multi-GPU IVF-Flat index from file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the input file

  • index[out] Multi-GPU IVF-Flat index

Returns:

cuvsError_t

IVF-Flat Index Distribute#

cuvsError_t cuvsMultiGpuIvfFlatDistribute(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuIvfFlatIndex_t index
)#

Distribute a local IVF-Flat index to create a Multi-GPU index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the local index file

  • index[out] Multi-GPU IVF-Flat index

Returns:

cuvsError_t

Multi-GPU IVF-PQ#

The Multi-GPU IVF-PQ method extends the IVF-PQ ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).

#include <cuvs/neighbors/mg_ivf_pq.h>

IVF-PQ Index Build Parameters#

typedef struct cuvsMultiGpuIvfPqIndexParams *cuvsMultiGpuIvfPqIndexParams_t#
cuvsError_t cuvsMultiGpuIvfPqIndexParamsCreate(
cuvsMultiGpuIvfPqIndexParams_t *index_params
)#

Allocate Multi-GPU IVF-PQ Index params, and populate with default values.

Parameters:

index_params[out] pointer to the cuvsMultiGpuIvfPqIndexParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfPqIndexParamsDestroy(
cuvsMultiGpuIvfPqIndexParams_t index_params
)#

De-allocate Multi-GPU IVF-PQ Index params.

Parameters:

index_params[in] cuvsMultiGpuIvfPqIndexParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfPqIndexParams#
#include <cuvs/neighbors/mg_ivf_pq.h>

Multi-GPU parameters to build IVF-PQ Index.

This structure extends the base IVF-PQ index parameters with multi-GPU specific settings.

Public Members

cuvsIvfPqIndexParams_t base_params#

Base IVF-PQ index parameters

cuvsMultiGpuDistributionMode mode#

Distribution mode for multi-GPU setup

IVF-PQ Index Search Parameters#

typedef struct cuvsMultiGpuIvfPqSearchParams *cuvsMultiGpuIvfPqSearchParams_t#
cuvsError_t cuvsMultiGpuIvfPqSearchParamsCreate(
cuvsMultiGpuIvfPqSearchParams_t *params
)#

Allocate Multi-GPU IVF-PQ search params, and populate with default values.

Parameters:

params[out] pointer to the cuvsMultiGpuIvfPqSearchParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfPqSearchParamsDestroy(
cuvsMultiGpuIvfPqSearchParams_t params
)#

De-allocate Multi-GPU IVF-PQ search params.

Parameters:

params[in] cuvsMultiGpuIvfPqSearchParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfPqSearchParams#
#include <cuvs/neighbors/mg_ivf_pq.h>

Multi-GPU parameters to search IVF-PQ index.

This structure extends the base IVF-PQ search parameters with multi-GPU specific settings.

Public Members

cuvsIvfPqSearchParams_t base_params#

Base IVF-PQ search parameters

cuvsMultiGpuReplicatedSearchMode search_mode#

Replicated search mode

cuvsMultiGpuShardedMergeMode merge_mode#

Sharded merge mode

int64_t n_rows_per_batch#

Number of rows per batch

IVF-PQ Index#

typedef cuvsMultiGpuIvfPqIndex *cuvsMultiGpuIvfPqIndex_t#
cuvsError_t cuvsMultiGpuIvfPqIndexCreate(
cuvsMultiGpuIvfPqIndex_t *index
)#

Allocate Multi-GPU IVF-PQ index.

Parameters:

index[out] pointer to the cuvsMultiGpuIvfPqIndex_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuIvfPqIndexDestroy(
cuvsMultiGpuIvfPqIndex_t index
)#

De-allocate Multi-GPU IVF-PQ index.

Parameters:

index[in] cuvsMultiGpuIvfPqIndex_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuIvfPqIndex#
#include <cuvs/neighbors/mg_ivf_pq.h>

Struct to hold address of cuvs::neighbors::mg_index<ivf_pq::index> and its active trained dtype.

IVF-PQ Index Build#

cuvsError_t cuvsMultiGpuIvfPqBuild(
cuvsResources_t res,
cuvsMultiGpuIvfPqIndexParams_t params,
DLManagedTensor *dataset_tensor,
cuvsMultiGpuIvfPqIndex_t index
)#

Build a Multi-GPU IVF-PQ index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • params[in] Multi-GPU IVF-PQ index parameters

  • dataset_tensor[in] DLManagedTensor* training dataset

  • index[out] Multi-GPU IVF-PQ index

Returns:

cuvsError_t

IVF-PQ Index Extend#

cuvsError_t cuvsMultiGpuIvfPqExtend(
cuvsResources_t res,
cuvsMultiGpuIvfPqIndex_t index,
DLManagedTensor *new_vectors_tensor,
DLManagedTensor *new_indices_tensor
)#

Extend a Multi-GPU IVF-PQ index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[inout] Multi-GPU IVF-PQ index to extend

  • new_vectors_tensor[in] DLManagedTensor* new vectors to add

  • new_indices_tensor[in] DLManagedTensor* new indices (optional, can be NULL)

Returns:

cuvsError_t

IVF-PQ Index Serialize#

cuvsError_t cuvsMultiGpuIvfPqSerialize(
cuvsResources_t res,
cuvsMultiGpuIvfPqIndex_t index,
const char *filename
)#

Serialize a Multi-GPU IVF-PQ index to file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[in] Multi-GPU IVF-PQ index to serialize

  • filename[in] Path to the output file

Returns:

cuvsError_t

IVF-PQ Index Deserialize#

cuvsError_t cuvsMultiGpuIvfPqDeserialize(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuIvfPqIndex_t index
)#

Deserialize a Multi-GPU IVF-PQ index from file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the input file

  • index[out] Multi-GPU IVF-PQ index

Returns:

cuvsError_t

IVF-PQ Index Distribute#

cuvsError_t cuvsMultiGpuIvfPqDistribute(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuIvfPqIndex_t index
)#

Distribute a local IVF-PQ index to create a Multi-GPU index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the local index file

  • index[out] Multi-GPU IVF-PQ index

Returns:

cuvsError_t

Multi-GPU CAGRA#

The Multi-GPU CAGRA method extends the CAGRA graph-based ANN algorithm to work across multiple GPUs. It provides two distribution modes: replicated (for higher throughput) and sharded (for handling larger datasets).

#include <cuvs/neighbors/mg_cagra.h>

CAGRA Index Build Parameters#

typedef struct cuvsMultiGpuCagraIndexParams *cuvsMultiGpuCagraIndexParams_t#
cuvsError_t cuvsMultiGpuCagraIndexParamsCreate(
cuvsMultiGpuCagraIndexParams_t *index_params
)#

Allocate Multi-GPU CAGRA Index params, and populate with default values.

Parameters:

index_params[out] pointer to the cuvsMultiGpuCagraIndexParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuCagraIndexParamsDestroy(
cuvsMultiGpuCagraIndexParams_t index_params
)#

De-allocate Multi-GPU CAGRA Index params.

Parameters:

index_params[in] cuvsMultiGpuCagraIndexParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuCagraIndexParams#
#include <cuvs/neighbors/mg_cagra.h>

Multi-GPU parameters to build CAGRA Index.

This structure extends the base CAGRA index parameters with multi-GPU specific settings.

Public Members

cuvsCagraIndexParams_t base_params#

Base CAGRA index parameters

cuvsMultiGpuDistributionMode mode#

Distribution mode for multi-GPU setup

CAGRA Index Search Parameters#

typedef struct cuvsMultiGpuCagraSearchParams *cuvsMultiGpuCagraSearchParams_t#
cuvsError_t cuvsMultiGpuCagraSearchParamsCreate(
cuvsMultiGpuCagraSearchParams_t *params
)#

Allocate Multi-GPU CAGRA search params, and populate with default values.

Parameters:

params[out] pointer to the cuvsMultiGpuCagraSearchParams_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuCagraSearchParamsDestroy(
cuvsMultiGpuCagraSearchParams_t params
)#

De-allocate Multi-GPU CAGRA search params.

Parameters:

params[in] cuvsMultiGpuCagraSearchParams_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuCagraSearchParams#
#include <cuvs/neighbors/mg_cagra.h>

Multi-GPU parameters to search CAGRA index.

This structure extends the base CAGRA search parameters with multi-GPU specific settings.

Public Members

cuvsCagraSearchParams_t base_params#

Base CAGRA search parameters

cuvsMultiGpuReplicatedSearchMode search_mode#

Replicated search mode

cuvsMultiGpuShardedMergeMode merge_mode#

Sharded merge mode

int64_t n_rows_per_batch#

Number of rows per batch

CAGRA Index#

typedef cuvsMultiGpuCagraIndex *cuvsMultiGpuCagraIndex_t#
cuvsError_t cuvsMultiGpuCagraIndexCreate(
cuvsMultiGpuCagraIndex_t *index
)#

Allocate Multi-GPU CAGRA index.

Parameters:

index[out] pointer to the cuvsMultiGpuCagraIndex_t handle to allocate

Returns:

cuvsError_t

cuvsError_t cuvsMultiGpuCagraIndexDestroy(
cuvsMultiGpuCagraIndex_t index
)#

De-allocate Multi-GPU CAGRA index.

Parameters:

index[in] cuvsMultiGpuCagraIndex_t to de-allocate

Returns:

cuvsError_t

struct cuvsMultiGpuCagraIndex#
#include <cuvs/neighbors/mg_cagra.h>

Struct to hold address of cuvs::neighbors::mg_index<cagra::index> and its active trained dtype.

CAGRA Index Build#

cuvsError_t cuvsMultiGpuCagraBuild(
cuvsResources_t res,
cuvsMultiGpuCagraIndexParams_t params,
DLManagedTensor *dataset_tensor,
cuvsMultiGpuCagraIndex_t index
)#

Build a Multi-GPU CAGRA index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • params[in] Multi-GPU CAGRA index parameters

  • dataset_tensor[in] DLManagedTensor* training dataset

  • index[out] Multi-GPU CAGRA index

Returns:

cuvsError_t

CAGRA Index Extend#

cuvsError_t cuvsMultiGpuCagraExtend(
cuvsResources_t res,
cuvsMultiGpuCagraIndex_t index,
DLManagedTensor *new_vectors_tensor,
DLManagedTensor *new_indices_tensor
)#

Extend a Multi-GPU CAGRA index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[inout] Multi-GPU CAGRA index to extend

  • new_vectors_tensor[in] DLManagedTensor* new vectors to add

  • new_indices_tensor[in] DLManagedTensor* new indices (optional, can be NULL)

Returns:

cuvsError_t

CAGRA Index Serialize#

cuvsError_t cuvsMultiGpuCagraSerialize(
cuvsResources_t res,
cuvsMultiGpuCagraIndex_t index,
const char *filename
)#

Serialize a Multi-GPU CAGRA index to file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • index[in] Multi-GPU CAGRA index to serialize

  • filename[in] Path to the output file

Returns:

cuvsError_t

CAGRA Index Deserialize#

cuvsError_t cuvsMultiGpuCagraDeserialize(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuCagraIndex_t index
)#

Deserialize a Multi-GPU CAGRA index from file.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the input file

  • index[out] Multi-GPU CAGRA index

Returns:

cuvsError_t

CAGRA Index Distribute#

cuvsError_t cuvsMultiGpuCagraDistribute(
cuvsResources_t res,
const char *filename,
cuvsMultiGpuCagraIndex_t index
)#

Distribute a local CAGRA index to create a Multi-GPU index.

Parameters:
  • res[in] cuvsResources_t opaque C handle

  • filename[in] Path to the local index file

  • index[out] Multi-GPU CAGRA index

Returns:

cuvsError_t