IVF-PQ

Index build parameters

class cuvs.neighbors.ivf_pq.IndexParams(n_lists=1024, *, metric='sqeuclidean', metric_arg=2.0, kmeans_n_iters=20, kmeans_trainset_fraction=0.5, pq_bits=8, pq_dim=0, codebook_kind='subspace', force_random_rotation=False, add_data_on_build=True, conservative_memory_allocation=False, max_train_points_per_pq_code=256)

Parameters to build an index for IVF-PQ nearest neighbor search.

Parameters:
n_lists: int, default = 1024

The number of clusters used in the coarse quantizer.

metric: str, default = "sqeuclidean"

String denoting the metric type. Valid values for metric: ["sqeuclidean", "inner_product", "euclidean"], where:

  • sqeuclidean is the Euclidean distance without the square root operation, i.e.: distance(a, b) = sum_i (a_i - b_i)^2,

  • euclidean is the Euclidean distance, i.e.: distance(a, b) = sqrt(sum_i (a_i - b_i)^2),

  • inner_product is the inner product, defined as distance(a, b) = sum_i a_i * b_i.
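The three metric definitions above can be written out directly. The following is a minimal pure-Python sketch for checking values by hand; the function names are illustrative and are not part of the cuVS API.

```python
import math

def sqeuclidean(a, b):
    """Squared Euclidean distance: sum_i (a_i - b_i)^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance: square root of the squared distance."""
    return math.sqrt(sqeuclidean(a, b))

def inner_product(a, b):
    """Inner product: sum_i a_i * b_i."""
    return sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [4.0, 6.0, 3.0]
print(sqeuclidean(a, b))    # 9 + 16 + 0 = 25.0
print(euclidean(a, b))      # 5.0
print(inner_product(a, b))  # 4 + 12 + 9 = 25.0
```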

kmeans_n_iters: int, default = 20

The number of k-means iterations used to search for cluster centers during index building.

kmeans_trainset_fraction: float, default = 0.5

If kmeans_trainset_fraction is less than 1, then the dataset is subsampled, and only n_samples * kmeans_trainset_fraction rows are used for training.
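As a rough illustration of the subsampling described above, the sketch below picks n_samples * kmeans_trainset_fraction row indices at random. The helper name, the rounding (truncation), and the uniform sampling are assumptions made for illustration; the sampler cuVS actually uses is not described here.

```python
import random

def subsample_rows(n_samples, kmeans_trainset_fraction=0.5, seed=0):
    """Pick row indices for k-means training (illustrative, not the cuVS sampler)."""
    if kmeans_trainset_fraction <= 0:
        raise ValueError("kmeans_trainset_fraction must be positive")
    # Assumption: truncate to an integer, keep at least one row,
    # and use every row when the fraction is 1 or more.
    n_train = min(n_samples, max(1, int(n_samples * kmeans_trainset_fraction)))
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_samples), n_train))

rows = subsample_rows(50000, 0.5)
print(len(rows))  # 25000
```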

pq_bits: int, default = 8

The bit length of the vector element after quantization.

pq_dim: int, default = 0

The dimensionality of the vector after product quantization. When zero, an optimal value is selected using a heuristic. Note: pq_dim * pq_bits must be a multiple of 8. Hint: a smaller pq_dim results in a smaller index size and better search performance, but lower recall. If pq_bits is 8, pq_dim can be set to any number, but multiples of 8 are desirable for good performance. If pq_bits is not 8, pq_dim should be a multiple of 8. For good performance, it is desirable that pq_dim is a multiple of 32. Ideally, pq_dim should also be a divisor of the dataset dim.
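The constraints and hints above can be collected into a small checker. This is a sketch only: check_pq_params is a hypothetical helper, not a cuVS function, and the bytes-per-vector figure counts only the packed PQ codes (pq_dim codes of pq_bits bits each), ignoring any padding or metadata the index may add.

```python
def check_pq_params(dim, pq_dim, pq_bits=8):
    """Return (code_bytes_per_vector, hints) for an explicit, non-zero pq_dim."""
    # Hard requirement stated in the parameter description.
    if (pq_dim * pq_bits) % 8 != 0:
        raise ValueError("pq_dim * pq_bits must be a multiple of 8")
    hints = []
    # Soft recommendations from the parameter description.
    if pq_dim % 8 != 0:
        hints.append("pq_dim is not a multiple of 8")
    if pq_dim % 32 != 0:
        hints.append("pq_dim is not a multiple of 32")
    if dim % pq_dim != 0:
        hints.append("pq_dim is not a divisor of dim")
    return pq_dim * pq_bits // 8, hints

print(check_pq_params(dim=128, pq_dim=64))  # (64, [])
print(check_pq_params(dim=50, pq_dim=20))   # 20 bytes, three hints
```

With pq_bits=8 each vector is stored in pq_dim bytes of codes, which is why a smaller pq_dim shrinks the index.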

codebook_kind: str, default = "subspace"

The kind of PQ codebook to train. Valid values: ["subspace", "cluster"].

force_random_rotation: bool, default = False

Apply a random rotation matrix on the input data and queries even if dim % pq_dim == 0. Note: if dim is not a multiple of pq_dim, a random rotation is always applied to the input data and queries to transform the working space from dim to rot_dim, which may be slightly larger than the original space and is a multiple of pq_dim (rot_dim % pq_dim == 0). However, this transform is not necessary when dim is a multiple of pq_dim (dim == rot_dim, hence no need to add "extra" data columns / features). By default, if dim == rot_dim, the rotation transform is initialized with the identity matrix. When force_random_rotation == True, a random orthogonal transform matrix is generated regardless of the values of dim and pq_dim.
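The relationship between dim, pq_dim and rot_dim can be sketched as follows. Both helpers are hypothetical, and the sketch assumes rot_dim is the smallest multiple of pq_dim that is not less than dim, which matches the "slightly larger" wording above but is not confirmed by it.

```python
def rotated_dim(dim, pq_dim):
    """Smallest multiple of pq_dim that is >= dim (assumed definition of rot_dim)."""
    return -(-dim // pq_dim) * pq_dim  # ceiling division, then scale back up

def uses_random_rotation(dim, pq_dim, force_random_rotation=False):
    """True when a random (non-identity) rotation would be generated."""
    return force_random_rotation or dim % pq_dim != 0

print(rotated_dim(50, 16))            # 64
print(rotated_dim(128, 32))           # 128
print(uses_random_rotation(50, 16))   # True
print(uses_random_rotation(128, 32))  # False
```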

add_data_on_build: bool, default = True

After training the coarse and fine quantizers, we will populate the index with the dataset if add_data_on_build == True, otherwise the index is left empty, and the extend method can be used to add new vectors to the index.

conservative_memory_allocation: bool, default = False

By default, the algorithm allocates more space than necessary for individual clusters (list_data). This amortizes the cost of memory allocation and reduces the number of data copies during repeated calls to extend (extending the database). To disable this behavior and use as little GPU memory for the database as possible, set this flag to True.
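The trade-off described above can be illustrated with a toy simulation of one cluster list growing across repeated extend calls. The growth policy used for the default case (round capacity up to the next power of two) is purely hypothetical and chosen for illustration; the policy cuVS uses is not stated here.

```python
def simulate_extends(batch_sizes, conservative):
    """Return (final_capacity, reallocations) for one list grown batch by batch."""
    capacity = 0
    size = 0
    reallocations = 0
    for batch in batch_sizes:
        size += batch
        if size > capacity:
            if conservative:
                capacity = size  # exact fit: no slack, minimal memory
            else:
                capacity = 1     # hypothetical policy: next power of two
                while capacity < size:
                    capacity *= 2
            reallocations += 1   # each reallocation implies a data copy
    return capacity, reallocations

batches = [100] * 10
print(simulate_extends(batches, conservative=True))   # (1000, 10)
print(simulate_extends(batches, conservative=False))  # (1024, 4)
```

In this toy model the conservative setting stores exactly 1000 slots but reallocates on every call, while over-allocation reallocates four times at the cost of 24 unused slots.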

max_train_points_per_pq_code: int, default = 256

The maximum number of data points to use per PQ code during PQ codebook training. Using more data points per PQ code may increase the quality of the PQ codebook but may also increase the build time. The parameter applies to both PQ codebook generation methods, i.e., PER_SUBSPACE and PER_CLUSTER. In both cases, pq_book_size * max_train_points_per_pq_code training points are used to train each codebook.
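The training-set size per codebook follows from the formula above. The sketch below assumes pq_book_size equals 2**pq_bits (one entry per possible code value); that definition is not given in this section, and the helper name is illustrative.

```python
def pq_training_points(pq_bits=8, max_train_points_per_pq_code=256):
    """Training points used per codebook: pq_book_size * max_train_points_per_pq_code."""
    pq_book_size = 1 << pq_bits  # assumption: 2**pq_bits codes per codebook
    return pq_book_size * max_train_points_per_pq_code

print(pq_training_points())           # 256 * 256 = 65536
print(pq_training_points(pq_bits=4))  # 16 * 256 = 4096
```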

Attributes:
add_data_on_build
codebook_kind
conservative_memory_allocation
force_random_rotation
kmeans_n_iters
kmeans_trainset_fraction
max_train_points_per_pq_code
metric
metric_arg
n_lists
pq_bits
pq_dim

Index search parameters

class cuvs.neighbors.ivf_pq.SearchParams(n_probes=20, *, lut_dtype=np.float32, internal_distance_dtype=np.float32)

Supplemental parameters to search an IVF-PQ index.

Parameters:
n_probes: int, default = 20

The number of clusters to search.

lut_dtype: default = np.float32

Data type of the look-up table to be created dynamically at search time. Using low-precision types reduces the amount of shared memory required at search time, so fast shared-memory kernels can be used even for datasets with large dimensionality. Note that recall is slightly degraded when a low-precision type is selected. Possible values: [np.float32, np.float16, np.uint8].
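To see why a lower-precision lut_dtype saves memory, the sketch below estimates the look-up table size. It assumes one table entry per (subspace, code) pair, i.e. pq_dim * 2**pq_bits entries, which is the usual product-quantization layout but may differ from the actual kernel. The string keys stand in for np.float32, np.float16 and np.uint8 so the example runs without NumPy.

```python
# Bytes per element for each lut_dtype option.
LUT_ITEMSIZE = {"float32": 4, "float16": 2, "uint8": 1}

def lut_bytes(pq_dim, pq_bits=8, lut_dtype="float32"):
    """Estimated look-up table size in bytes (assumed layout, see lead-in)."""
    pq_book_size = 1 << pq_bits  # assumption: 2**pq_bits codes per subspace
    return pq_dim * pq_book_size * LUT_ITEMSIZE[lut_dtype]

for dtype in ("float32", "float16", "uint8"):
    print(dtype, lut_bytes(pq_dim=128, pq_bits=8, lut_dtype=dtype))
# float32 131072
# float16 65536
# uint8 32768
```

Under these assumptions, switching from float32 to uint8 cuts the table to a quarter of its size, which is what lets it fit in shared memory for larger pq_dim.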

internal_distance_dtype: default = np.float32

Storage data type for distance/similarity computation. Possible values: [np.float32, np.float16].

Attributes:
internal_distance_dtype
lut_dtype
n_probes

Index

class cuvs.neighbors.ivf_pq.Index

IVF-PQ index object. This object stores the trained IVF-PQ index state, which can be used to perform nearest neighbor searches.

Attributes:
trained

Index build

cuvs.neighbors.ivf_pq.build(IndexParams index_params, dataset, resources=None)

Build the IVF-PQ index from the dataset for efficient search.

Parameters:
index_params: cuvs.neighbors.ivf_pq.IndexParams

Parameters on how to build the index.

dataset: CUDA array interface compliant matrix, shape (n_samples, dim)

Supported dtype: [float, int8, uint8]

resources: Optional cuVS Resource handle for reusing CUDA resources.

If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.

Returns:
index: cuvs.neighbors.ivf_pq.Index

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import ivf_pq
>>> n_samples = 50000
>>> n_features = 50
>>> n_queries = 1000
>>> k = 10
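>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> queries = cp.random.random_sample((n_queries, n_features),
...                                   dtype=cp.float32)
>>> build_params = ivf_pq.IndexParams(metric="sqeuclidean")
>>> index = ivf_pq.build(build_params, dataset)
>>> # Search the index for the k nearest neighbors of each query
>>> distances, neighbors = ivf_pq.search(ivf_pq.SearchParams(),
...                                        index, queries,
...                                        k)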
>>> distances = cp.asarray(distances)
>>> neighbors = cp.asarray(neighbors)

Index save

cuvs.neighbors.ivf_pq.save(filename, Index index, bool include_dataset=True, resources=None)

Saves the index to a file.

Saving / loading the index is experimental. The serialization format is subject to change.

Parameters:
filename: string

Name of the file.

index: Index

Trained IVF-PQ index.

resources: Optional cuVS Resource handle for reusing CUDA resources.

If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.

Examples

>>> import cupy as cp
>>> from cuvs.neighbors import ivf_pq
>>> n_samples = 50000
>>> n_features = 50
>>> dataset = cp.random.random_sample((n_samples, n_features),
...                                   dtype=cp.float32)
>>> # Build index
>>> index = ivf_pq.build(ivf_pq.IndexParams(), dataset)
>>> # Serialize and deserialize the ivf_pq index built
>>> ivf_pq.save("my_index.bin", index)
>>> index_loaded = ivf_pq.load("my_index.bin")

Index load

cuvs.neighbors.ivf_pq.load(filename, resources=None)

Loads an index from a file.

Saving / loading the index is experimental. The serialization format is subject to change, therefore loading an index saved with a previous version of cuvs is not guaranteed to work.

Parameters:
filename: string

Name of the file.

resources: Optional cuVS Resource handle for reusing CUDA resources.

If Resources aren’t supplied, CUDA resources will be allocated inside this function and synchronized before the function exits. If resources are supplied, you will need to explicitly synchronize yourself by calling resources.sync() before accessing the output.

Returns:
index: Index