IVF-PQ#

The IVF-PQ method is an approximate nearest neighbor (ANN) algorithm. Like IVF-Flat, IVF-PQ splits the points into a number of clusters (specified by the n_lists parameter) and searches only the closest clusters (their number is specified by the n_probes parameter) to compute the nearest neighbors. In addition, it shrinks the vectors using a technique called product quantization.

#include <cuvs/neighbors/ivf_pq.hpp>

namespace cuvs::neighbors::ivf_pq

Index build parameters#

enum class codebook_gen#

A type for specifying how PQ codebooks are created.

Values:

enumerator PER_SUBSPACE#
enumerator PER_CLUSTER#
enum class list_layout#

A type for specifying the memory layout of PQ codes in IVF lists.

Values:

enumerator FLAT#

Flat layout: each vector’s PQ codes stored contiguously [n_rows, bytes_per_vector].

enumerator INTERLEAVED#

Interleaved layout: codes from multiple vectors interleaved for coalesced memory access.

struct index_params : public cuvs::neighbors::index_params#
#include <ivf_pq.hpp>

Public Members

uint32_t n_lists = 1024#

The number of inverted lists (clusters)

Hint: the number of vectors per cluster (n_rows/n_lists) should be approximately 1,000 to 10,000.
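As a rough illustration of the hint above, the following standalone sketch computes the range of n_lists values that keeps n_rows/n_lists between about 1,000 and 10,000. It is plain C++ and not part of cuVS; the helper name suggest_n_lists_range is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper (not a cuVS function): returns {min_n_lists, max_n_lists}
// such that n_rows / n_lists stays roughly within [1000, 10000].
inline std::pair<uint32_t, uint32_t> suggest_n_lists_range(uint64_t n_rows)
{
  assert(n_rows > 0);
  // Fewest lists: about 10,000 vectors per cluster (rounded up, at least 1).
  uint64_t lo = std::max<uint64_t>(1, (n_rows + 9999) / 10000);
  // Most lists: about 1,000 vectors per cluster (never fewer than lo).
  uint64_t hi = std::max<uint64_t>(lo, n_rows / 1000);
  return {static_cast<uint32_t>(lo), static_cast<uint32_t>(hi)};
}
```

For a dataset of 10 million rows this gives a range of 1,000 to 10,000 lists, so the default n_lists = 1024 sits near the lower bound.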

uint32_t kmeans_n_iters = 20#

The number of iterations searching for kmeans centers (index building).

double kmeans_trainset_fraction = 0.5#

The fraction of data to use during iterative kmeans building.

uint32_t pq_bits = 8#

The bit length of the vector element after compression by PQ.

Possible values: [4, 5, 6, 7, 8].

Hint: the smaller the ‘pq_bits’, the smaller the index size and the better the search performance, but the lower the recall.

uint32_t pq_dim = 0#

The dimensionality of the vector after compression by PQ. When zero, an optimal value is selected using a heuristic.

NB: pq_dim * pq_bits must be a multiple of 8.

Hint: a smaller ‘pq_dim’ results in a smaller index size and better search performance, but lower recall. If ‘pq_bits’ is 8, ‘pq_dim’ can be set to any number, but multiples of 8 are desirable for good performance. If ‘pq_bits’ is not 8, ‘pq_dim’ should be a multiple of 8. For good performance, it is desirable that ‘pq_dim’ is a multiple of 32. Ideally, ‘pq_dim’ should also be a divisor of the dataset dim.
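The constraints on pq_bits and pq_dim can be checked before building an index. The sketch below is standalone C++ with hypothetical helper names (pq_params_valid, pq_code_bytes); it checks only the hard rule that pq_dim * pq_bits is a multiple of 8 and computes the resulting code size per vector, ignoring any padding the library may add.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers, not part of the cuVS API.

// True when pq_bits is in [4, 8] and pq_dim * pq_bits is a multiple of 8.
// pq_dim == 0 means "auto-select" in index_params, so it is rejected here
// because there is no concrete value to validate.
inline bool pq_params_valid(uint32_t pq_dim, uint32_t pq_bits)
{
  if (pq_bits < 4 || pq_bits > 8 || pq_dim == 0) { return false; }
  return (pq_dim * pq_bits) % 8 == 0;
}

// Bytes needed for the PQ codes of one vector, ignoring any padding or
// per-list alignment the library may add internally.
inline uint32_t pq_code_bytes(uint32_t pq_dim, uint32_t pq_bits)
{
  assert(pq_params_valid(pq_dim, pq_bits));
  return pq_dim * pq_bits / 8;
}
```

For example, a 128-dimensional float32 vector occupies 512 bytes, while pq_dim = 64 with pq_bits = 8 yields 64 bytes of codes per vector.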

codebook_gen codebook_kind = codebook_gen::PER_SUBSPACE#

How PQ codebooks are created.

list_layout codes_layout = list_layout::INTERLEAVED#

Memory layout of PQ codes in IVF lists.

  • INTERLEAVED (default): Codes from multiple vectors are interleaved for coalesced GPU memory access during search. This is optimized for search performance.

  • FLAT: Each vector’s PQ codes are stored contiguously.

bool force_random_rotation = false#

Apply a random rotation matrix on the input data and queries even if dim % pq_dim == 0.

Note: if dim is not a multiple of pq_dim, a random rotation is always applied to the input data and queries to transform the working space from dim to rot_dim, which may be slightly larger than the original space and is a multiple of pq_dim (rot_dim % pq_dim == 0). However, this transform is not necessary when dim is a multiple of pq_dim (dim == rot_dim, hence no need to add “extra” data columns / features).

By default, if dim == rot_dim, the rotation transform is initialized with the identity matrix. When force_random_rotation == true, a random orthogonal transform matrix is generated regardless of the values of dim and pq_dim.
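The relationship between dim, pq_dim, and rot_dim described above can be sketched as follows. This is standalone C++ for illustration only; the struct and function names are hypothetical, and the sketch assumes pq_len is the smallest integer for which pq_len * pq_dim >= dim.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only; the names mirror the documentation, not cuVS internals.
struct rotation_info {
  uint32_t pq_len;       // components per subspace
  uint32_t rot_dim;      // pq_len * pq_dim, always >= dim
  bool random_rotation;  // whether a random (non-identity) transform is used
};

// Assumes pq_len is the smallest integer with pq_len * pq_dim >= dim.
inline rotation_info describe_rotation(uint32_t dim, uint32_t pq_dim, bool force_random_rotation)
{
  assert(dim > 0 && pq_dim > 0);
  uint32_t pq_len  = (dim + pq_dim - 1) / pq_dim;  // ceiling division
  uint32_t rot_dim = pq_len * pq_dim;
  // A random rotation is used when forced, or when padding is needed.
  return {pq_len, rot_dim, force_random_rotation || rot_dim != dim};
}
```

With dim = 100 and pq_dim = 32, the sketch gives pq_len = 4 and rot_dim = 128, so a random rotation is applied even without force_random_rotation.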

bool conservative_memory_allocation = false#

By default, the algorithm allocates more space than necessary for individual clusters (list_data). This amortizes the cost of memory allocation and reduces the number of data copies during repeated calls to extend (extending the database).

The alternative is the conservative allocation behavior; when enabled, the algorithm always allocates the minimum amount of memory required to store the given number of records. Set this flag to true if you prefer to use as little GPU memory for the database as possible.

bool add_data_on_build = true#

Whether to add the dataset content to the index, i.e.:

  • true means the index is filled with the dataset vectors and ready to search after calling build.

  • false means build only trains the underlying model (e.g. quantizer or clustering), but the index is left empty; you’d need to call extend on the index afterwards to populate it.

uint32_t max_train_points_per_pq_code = 256#

The max number of data points to use per PQ code during PQ codebook training. Using more data points per PQ code may increase the quality of PQ codebook but may also increase the build time. The parameter is applied to both PQ codebook generation methods, i.e., PER_SUBSPACE and PER_CLUSTER. In both cases, we will use pq_book_size * max_train_points_per_pq_code training points to train each codebook.
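The training-set size per codebook follows directly from the formula above, with pq_book_size equal to 1 << pq_bits. A minimal standalone sketch (the function name is hypothetical, and it ignores any cap imposed by the amount of data actually available):

```cpp
#include <cassert>
#include <cstdint>

// Illustrative arithmetic only, not a cuVS function.
// pq_book_size is 1 << pq_bits; each codebook is trained on
// pq_book_size * max_train_points_per_pq_code points.
inline uint64_t train_points_per_codebook(uint32_t pq_bits, uint32_t max_train_points_per_pq_code)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  uint64_t pq_book_size = uint64_t{1} << pq_bits;
  return pq_book_size * max_train_points_per_pq_code;
}
```

With the defaults (pq_bits = 8, max_train_points_per_pq_code = 256) this comes to 65,536 training points per codebook.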

Public Static Functions

static index_params from_dataset(
raft::matrix_extent<int64_t> dataset,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Expanded
)#

Creates index_params based on shape of the input dataset. Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// create index_params for a [N, D] dataset and have InnerProduct as the distance metric
auto dataset = raft::make_device_matrix<float, int64_t>(res, N, D);
ivf_pq::index_params index_params =
  ivf_pq::index_params::from_dataset(dataset.extents(), cuvs::distance::DistanceType::InnerProduct);
// modify/update index_params as needed
index_params.add_data_on_build = true;

Index search parameters#

struct search_params : public cuvs::neighbors::search_params#
#include <ivf_pq.hpp>

Public Members

uint32_t n_probes = 20#

The number of clusters to search.

cudaDataType_t lut_dtype = CUDA_R_32F#

Data type of look up table to be created dynamically at search time.

Possible values: [CUDA_R_32F, CUDA_R_16F, CUDA_R_8U]

Using a low-precision type reduces the amount of shared memory required at search time, so fast shared memory kernels can be used even for datasets with large dimensionality. Note that recall is slightly degraded when a low-precision type is selected.
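To get a feel for the shared memory savings, the sketch below estimates the lookup table size for a given element width. It is standalone C++ with a hypothetical function name, and it assumes the table holds one entry per (subspace, codebook entry) pair; that layout is an assumption made for illustration, not something this page specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rough estimate, assuming the lookup table holds one entry per
// (subspace, codebook entry) pair, i.e. pq_dim * (1 << pq_bits) entries.
// This layout is an assumption for illustration, not a documented guarantee.
inline std::size_t lut_bytes(uint32_t pq_dim, uint32_t pq_bits, std::size_t bytes_per_entry)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  return static_cast<std::size_t>(pq_dim) * (std::size_t{1} << pq_bits) * bytes_per_entry;
}
```

Under that assumption, pq_dim = 128 with pq_bits = 8 needs 128 KiB for a 32-bit table (CUDA_R_32F), 64 KiB for a 16-bit table (CUDA_R_16F), and 32 KiB for an 8-bit table (CUDA_R_8U).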

cudaDataType_t internal_distance_dtype = CUDA_R_32F#

Storage data type for distance/similarity computed at search time.

Possible values: [CUDA_R_16F, CUDA_R_32F]

If the performance limiter at search time is device memory access, selecting FP16 will improve performance slightly.

double preferred_shmem_carveout = 1.0#

Preferred fraction of SM’s unified memory / L1 cache to be used as shared memory.

Possible values: [0.0 - 1.0] as a fraction of the sharedMemPerMultiprocessor.

Increasing the carveout helps ensure good GPU occupancy for the main search kernel, but setting it too high leaves little memory for use as L1 cache. Note, this value is interpreted only as a hint. Moreover, a GPU usually allows only a fixed set of cache configurations, so the provided value is rounded up to the nearest configuration. Refer to the NVIDIA tuning guide for the target GPU architecture.

Note, this is a low-level tuning parameter that can have drastic negative effects on the search performance if tweaked incorrectly.

cudaDataType_t coarse_search_dtype = CUDA_R_32F#

[Experimental] The data type to use as the GEMM element type when searching the clusters to probe.

Possible values: [CUDA_R_8I, CUDA_R_16F, CUDA_R_32F].

  • Legacy default: CUDA_R_32F (float)

  • Recommended for performance: CUDA_R_16F (half)

  • Experimental/low-precision: CUDA_R_8I (int8_t) (WARNING: int8_t variant degrades recall unless data is normalized and low-dimensional)

uint32_t max_internal_batch_size = 4096#

Set the internal batch size to improve GPU utilization at the cost of larger memory footprint.

Index#

template<typename IdxT>
class index : public cuvs::neighbors::ivf_pq::index_iface<IdxT>, private cuvs::neighbors::index#
#include <ivf_pq.hpp>

IVF-PQ index.

In the IVF-PQ index, a database vector y is approximated with two-level quantization:

y = Q_1(y) + Q_2(y - Q_1(y))

The first-level quantizer (Q_1) maps the vector y to the nearest cluster center. The number of clusters is n_lists.

The second quantizer encodes the residual, and it is defined as a product quantizer [1].

A product quantizer encodes a dim-dimensional vector with a pq_dim-dimensional vector. First, the input vector is split into pq_dim subvectors (denoted by u), where each subvector contains pq_len distinct components of y:

u_1 = (y_1, y_2, …, y_{pq_len})
u_2 = (y_{pq_len+1}, …, y_{2*pq_len})
…
u_{pq_dim} = (y_{dim-pq_len+1}, …, y_{dim})

Then each subvector is encoded with a separate quantizer q_i, and the results are concatenated:

Q_2(y) = q_1(u_1), q_2(u_2), …, q_{pq_dim}(u_{pq_dim})

Each quantizer q_i outputs a code with pq_bits bits. The second-level quantizers are also defined by k-means clustering in the corresponding subspace: the reproduction values are the centroids, and the set of reproduction values is the codebook.

When the data dimensionality dim is not a multiple of pq_dim, the feature space is transformed using a random orthogonal matrix to have rot_dim = pq_dim * pq_len dimensions (rot_dim >= dim).

The second-level quantizers are trained either for each subspace or for each cluster: (a) codebook_gen::PER_SUBSPACE: creates pq_dim second-level quantizers - one for each slice of the data along features; (b) codebook_gen::PER_CLUSTER: creates n_lists second-level quantizers - one for each first-level cluster. In either case, the centroids are again found using k-means clustering interpreting the data as having pq_len dimensions.
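The second-level encoding step can be illustrated with a toy CPU product quantizer. The sketch below is standalone C++ and is not how cuVS implements it (the library runs on the GPU with its own data layout); the names pq_encode and codebook_t are hypothetical. It assumes per-subspace codebooks and takes the residual y - Q_1(y) as input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Toy CPU product quantizer for illustration only.
// codebooks[i][c] is centroid c (pq_len floats) of subspace i.
using codebook_t = std::vector<std::vector<float>>;

inline std::vector<uint8_t> pq_encode(const std::vector<float>& residual,
                                      const std::vector<codebook_t>& codebooks)
{
  const std::size_t pq_dim = codebooks.size();
  assert(pq_dim > 0 && residual.size() % pq_dim == 0);
  const std::size_t pq_len = residual.size() / pq_dim;

  std::vector<uint8_t> codes(pq_dim);
  for (std::size_t i = 0; i < pq_dim; ++i) {
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < codebooks[i].size(); ++c) {
      // Squared L2 distance between subvector u_i and centroid c.
      float dist = 0.0f;
      for (std::size_t j = 0; j < pq_len; ++j) {
        float diff = residual[i * pq_len + j] - codebooks[i][c][j];
        dist += diff * diff;
      }
      // Keep the index of the nearest centroid as the code for subspace i.
      if (dist < best_dist) {
        best_dist = dist;
        codes[i]  = static_cast<uint8_t>(c);
      }
    }
  }
  return codes;
}
```

Each output element is the index of the nearest centroid in the corresponding subspace, so with pq_bits = 8 a codebook can hold up to 256 centroids and every code fits in one byte.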

[1] Product quantization for nearest neighbor search Herve Jegou, Matthijs Douze, Cordelia Schmid

Template Parameters:

IdxT – type of the indices in the source dataset

Subclassed by cuvs::neighbors::ivf_pq::typed_index< T, IdxT >

Public Functions

index(raft::resources const &handle)#

Construct an empty index.

This index will either need to be trained with build or loaded from a saved copy with deserialize.

index(
raft::resources const &handle,
cuvs::distance::DistanceType metric,
codebook_gen codebook_kind,
uint32_t n_lists,
uint32_t dim,
uint32_t pq_bits = 8,
uint32_t pq_dim = 0,
bool conservative_memory_allocation = false
)#

Construct an index with specified parameters.

This constructor creates an owning index with the given parameters.

Parameters:
  • handle – RAFT resources handle

  • metric – Distance metric for clustering

  • codebook_kind – How PQ codebooks are created

  • n_lists – Number of inverted lists (clusters)

  • dim – Dimensionality of the input data

  • pq_bits – Bit length of vector elements after PQ compression

  • pq_dim – Dimensionality after PQ compression (0 = auto-select)

  • conservative_memory_allocation – Memory allocation strategy

index(
raft::resources const &handle,
const index_params &params,
uint32_t dim
)#

Construct an index from index parameters.

Parameters:
  • handle – RAFT resources handle

  • params – Index parameters

  • dim – Dimensionality of the input data

virtual IdxT size() const noexcept override#

Total length of the index.

virtual uint32_t dim() const noexcept override#

Dimensionality of the input data.

virtual uint32_t dim_ext() const noexcept#

Dimensionality of the cluster centers: input data dim extended with vector norms and padded to 8 elems.

virtual uint32_t rot_dim() const noexcept#

Dimensionality of the data after transforming it for PQ processing (rotated and augmented to be a multiple of pq_dim).

virtual uint32_t pq_bits() const noexcept override#

The bit length of an encoded vector element after compression by PQ.

virtual uint32_t pq_dim() const noexcept override#

The dimensionality of an encoded vector after compression by PQ.

virtual uint32_t pq_len() const noexcept#

Dimensionality of a subspace, i.e. the number of vector components mapped to a subspace

virtual uint32_t pq_book_size() const noexcept#

The number of vectors in a PQ codebook (1 << pq_bits).

virtual cuvs::distance::DistanceType metric() const noexcept override#

Distance metric used for clustering.

virtual codebook_gen codebook_kind() const noexcept override#

How PQ codebooks are created.

virtual list_layout codes_layout() const noexcept override#

Memory layout of PQ codes in IVF lists.

virtual uint32_t n_lists() const noexcept#

Number of clusters/inverted lists (first level quantization).

virtual bool conservative_memory_allocation() const noexcept override#

Whether to use conservative memory allocation when extending the list (cluster) data (see index_params.conservative_memory_allocation).

virtual raft::device_mdspan<const float, pq_centers_extents, raft::row_major> pq_centers(
) const noexcept override#

PQ cluster centers

virtual std::vector<std::shared_ptr<list_data_base<IdxT>>> &lists(
) noexcept override#

Lists’ data and indices (polymorphic, works for both FLAT and INTERLEAVED layouts).

virtual raft::device_vector_view<uint8_t*, uint32_t, raft::row_major> data_ptrs(
) noexcept override#

Pointers to the inverted lists (clusters) data [n_lists].

virtual raft::device_vector_view<IdxT*, uint32_t, raft::row_major> inds_ptrs(
) noexcept override#

Pointers to the inverted lists (clusters) indices [n_lists].

virtual raft::device_matrix_view<const float, uint32_t, raft::row_major> rotation_matrix(
) const noexcept override#

The transform matrix (original space -> rotated padded space) [rot_dim, dim]

virtual raft::host_vector_view<IdxT, uint32_t, raft::row_major> accum_sorted_sizes(
) noexcept override#

Accumulated list sizes, sorted in descending order [n_lists + 1]. The last value contains the total length of the index. The value at index zero is always zero.

That is, the content of this span is as if the list_sizes was sorted and then accumulated.

This span is used during search to estimate the maximum size of the workspace.
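The described content of this span can be reproduced on the host as follows. This is a standalone C++ sketch with a hypothetical function name; it is not the code cuVS uses internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Illustrative reconstruction of the described content on the host.
inline std::vector<int64_t> accumulate_sorted_sizes(std::vector<uint32_t> list_sizes)
{
  // Sort list sizes in descending order.
  std::sort(list_sizes.begin(), list_sizes.end(), std::greater<uint32_t>());
  // Output has n_lists + 1 entries; entry 0 is always zero.
  std::vector<int64_t> accum(list_sizes.size() + 1, 0);
  for (std::size_t i = 0; i < list_sizes.size(); ++i) {
    accum[i + 1] = accum[i] + list_sizes[i];
  }
  return accum;
}
```

For list sizes {3, 10, 5} the result is {0, 10, 15, 18}: entry k is the total size of the k largest lists, which bounds the number of candidates for a query that probes k lists.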

virtual raft::device_vector_view<uint32_t, uint32_t, raft::row_major> list_sizes(
) noexcept override#

Sizes of the lists [n_lists].

virtual raft::device_matrix_view<const float, uint32_t, raft::row_major> centers(
) const noexcept override#

Cluster centers corresponding to the lists in the original space [n_lists, dim_ext]

virtual raft::device_matrix_view<const float, uint32_t, raft::row_major> centers_rot(
) const noexcept override#

Cluster centers corresponding to the lists in the rotated space [n_lists, rot_dim]

virtual uint32_t get_list_size_in_bytes(
uint32_t label
) const override#

Fetch the size of a particular IVF list in bytes using the list extents. Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// use default index params
ivf_pq::index_params index_params;
// extend the IVF lists while building the index
index_params.add_data_on_build = true;
// create and fill the index from a [N, D] dataset
auto index = cuvs::neighbors::ivf_pq::build(res, index_params, dataset);
// Fetch the size of the fourth list
uint32_t size = index.get_list_size_in_bytes(3);

Parameters:

label[in] list ID

explicit index(std::unique_ptr<index_iface<IdxT>> impl)#

Construct index from implementation pointer.

This constructor is used internally by build/extend/deserialize functions.

Parameters:

impl – Implementation pointer (owning or view)

Index build#

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const float, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a device matrix view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const float, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::device_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const half, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a device matrix view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const half, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::device_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a device matrix view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::device_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a device matrix view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::device_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const float, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a host_matrix_view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const float, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::host_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const half, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a host_matrix_view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const half, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::host_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a host_matrix_view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::host_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset
)#

Build the index from the dataset for efficient search.

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] a host_matrix_view to a row-major matrix [n_rows, dim]

Returns:

the constructed ivf-pq index

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build the index from the dataset for efficient search.

NB: Currently, the following distance metrics are supported:

  • L2Expanded

  • L2Unexpanded

  • InnerProduct

  • CosineExpanded

Note, if index_params.add_data_on_build is set to true, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping. This is only applicable if index_params.add_data_on_build is set to true
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// create and fill the index from a [N, D] dataset
ivf_pq::index<int64_t> index(handle);
ivf_pq::build(handle, index_params, dataset, &index);

Parameters:
  • handle[in]

  • index_params – configure the index building

  • dataset[in] raft::host_matrix_view to a row-major matrix [n_rows, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
const uint32_t dim,
raft::device_mdspan<const float, raft::extent_3d<uint32_t>, raft::row_major> pq_centers,
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers,
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers_rot,
raft::device_matrix_view<const float, uint32_t, raft::row_major> rotation_matrix
)#

Build a view-type IVF-PQ index from device memory centroids and codebook.

This function creates a non-owning index that stores a reference to the provided device data. All parameters must be provided with correct extents. The caller is responsible for ensuring the lifetime of the input data exceeds the lifetime of the returned index.

The index_params must be consistent with the provided matrices. Specifically:

Parameters:
  • handle[in] raft resources handle

  • index_params – configure the index (metric, codebook_kind, etc.). Must be consistent with the provided matrices.

  • dim[in] dimensionality of the input data

  • pq_centers[in] PQ codebook on device memory with required extents:

  • centers[in] Cluster centers in the original space [n_lists, dim_ext] where dim_ext = round_up(dim + 1, 8)

  • centers_rot[in] Rotated cluster centers [n_lists, rot_dim] where rot_dim = pq_len * pq_dim

  • rotation_matrix[in] Transform matrix (original space -> rotated padded space) [rot_dim, dim]

Returns:

A view-type ivf_pq index that references the provided data
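
The extent requirements listed above can be checked on the host before calling build. The following is a minimal sketch and not part of the cuVS API: view_build_extents, expected_extents, ceildiv and round_up are hypothetical helpers, and the relation pq_len = ceildiv(dim, pq_dim) is an assumption that this section does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the extents named in the parameter list above.
struct view_build_extents {
  uint32_t dim_ext;  // second extent of `centers`
  uint32_t pq_len;   // subspace length (assumed to be ceildiv(dim, pq_dim))
  uint32_t rot_dim;  // second extent of `centers_rot`, first extent of `rotation_matrix`
};

// Integer division rounding up.
constexpr uint32_t ceildiv(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Round `v` up to the nearest multiple of `m`.
constexpr uint32_t round_up(uint32_t v, uint32_t m) { return ceildiv(v, m) * m; }

// Compute dim_ext = round_up(dim + 1, 8) and rot_dim = pq_len * pq_dim.
inline view_build_extents expected_extents(uint32_t dim, uint32_t pq_dim)
{
  assert(dim > 0 && pq_dim > 0);
  const uint32_t pq_len = ceildiv(dim, pq_dim);  // assumption, see the lead-in
  return view_build_extents{round_up(dim + 1, 8), pq_len, pq_len * pq_dim};
}
```

Under these assumptions, dim = 100 and pq_dim = 32 give dim_ext = 104, pq_len = 4 and rot_dim = 128, so centers would be [n_lists, 104] and rotation_matrix would be [128, 100].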

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
const uint32_t dim,
raft::device_mdspan<const float, raft::extent_3d<uint32_t>, raft::row_major> pq_centers,
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers,
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers_rot,
raft::device_matrix_view<const float, uint32_t, raft::row_major> rotation_matrix,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build an IVF-PQ index from device memory centroids and codebook.

This function creates a non-owning index that references the provided device data directly. All parameters must be provided with correct extents. The caller is responsible for ensuring the lifetime of the input data exceeds the lifetime of the index written to idx.

The index_params (metric, codebook_kind, etc.) must be consistent with the provided matrices.

Parameters:
  • handle[in] raft resources handle

  • index_params – configure the index (metric, codebook_kind, etc.). Must be consistent with the provided matrices.

  • dim[in] dimensionality of the input data

  • pq_centers[in] PQ codebook on device memory; the required extents depend on index_params.codebook_kind

  • centers[in] Cluster centers in the original space [n_lists, dim_ext] where dim_ext = round_up(dim + 1, 8)

  • centers_rot[in] Rotated cluster centers [n_lists, rot_dim] where rot_dim = pq_len * pq_dim

  • rotation_matrix[in] Transform matrix (original space -> rotated padded space) [rot_dim, dim]

  • idx[out] pointer to ivf_pq::index

cuvs::neighbors::ivf_pq::index<int64_t> build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
const uint32_t dim,
raft::host_mdspan<const float, raft::extent_3d<uint32_t>, raft::row_major> pq_centers,
raft::host_matrix_view<const float, uint32_t, raft::row_major> centers,
std::optional<raft::host_matrix_view<const float, uint32_t, raft::row_major>> centers_rot,
std::optional<raft::host_matrix_view<const float, uint32_t, raft::row_major>> rotation_matrix
)#

Build an IVF-PQ index from host memory centroids and codebook (in-place).

Parameters:
  • handle[in] raft resources handle

  • index_params – configure the index building

  • dim[in] dimensionality of the input data

  • pq_centers[in] PQ codebook

  • centers[in] Cluster centers

  • centers_rot[in] Optional rotated cluster centers

  • rotation_matrix[in] Optional rotation matrix

void build(
raft::resources const &handle,
const cuvs::neighbors::ivf_pq::index_params &index_params,
const uint32_t dim,
raft::host_mdspan<const float, raft::extent_3d<uint32_t>, raft::row_major> pq_centers,
raft::host_matrix_view<const float, uint32_t, raft::row_major> centers,
std::optional<raft::host_matrix_view<const float, uint32_t, raft::row_major>> centers_rot,
std::optional<raft::host_matrix_view<const float, uint32_t, raft::row_major>> rotation_matrix,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Build an IVF-PQ index from host memory centroids and codebook (in-place).

Parameters:
  • handle[in] raft resources handle

  • index_params – configure the index building

  • dim[in] dimensionality of the input data

  • pq_centers[in] PQ codebook on host memory

  • centers[in] Cluster centers on host memory

  • centers_rot[in] Optional rotated cluster centers on host

  • rotation_matrix[in] Optional rotation matrix on host

  • idx[out] pointer to IVF-PQ index to be built

Index extend#

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::device_matrix_view<const float, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned
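
The meaning of passing std::nullopt for new_indices can be illustrated with a small host-side sketch. It is not the library's implementation: resolve_indices is a hypothetical helper, and std::vector stands in for the device vector view. It only shows the documented convention that a missing index list implies the continuous range [0...n_rows).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Return the ids that would be attached to `n_rows` new vectors:
// the user-provided ids if present, otherwise 0, 1, ..., n_rows - 1.
inline std::vector<int64_t> resolve_indices(
  const std::optional<std::vector<int64_t>>& new_indices, int64_t n_rows)
{
  assert(n_rows >= 0);
  if (new_indices.has_value()) {
    // Explicit ids must cover every new row.
    assert(new_indices->size() == static_cast<std::size_t>(n_rows));
    return *new_indices;
  }
  std::vector<int64_t> ids(static_cast<std::size_t>(n_rows));
  std::iota(ids.begin(), ids.end(), int64_t{0});  // continuous range [0, n_rows)
  return ids;
}
```

Since the implied range always starts at zero, it is only meaningful for an empty index; once the index holds data, explicit indices avoid colliding with ids that are already stored.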

void extend(
raft::resources const &handle,
raft::device_matrix_view<const float, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::device_matrix_view<const half, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::device_matrix_view<const half, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::device_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// fill the index with the data
std::optional<raft::device_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a device matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a device vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::host_matrix_view<const float, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::host_matrix_view<const float, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::host_matrix_view<const half, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::host_matrix_view<const half, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

cuvs::neighbors::ivf_pq::index<int64_t> extend(
raft::resources const &handle,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
const cuvs::neighbors::ivf_pq::index<int64_t> &idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
auto index = ivf_pq::extend(handle, new_vectors, no_op, index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[in] the original index; it is taken by const reference and the extended index is returned

void extend(
raft::resources const &handle,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> new_vectors,
std::optional<raft::host_vector_view<const int64_t, int64_t>> new_indices,
cuvs::neighbors::ivf_pq::index<int64_t> *idx
)#

Extend the index with the new data.

Note, the user can set a stream pool in the input raft::resource with at least one stream to enable kernel and copy overlapping.

Usage example:

using namespace cuvs::neighbors;
ivf_pq::index_params index_params;
index_params.add_data_on_build = false;      // don't populate index on build
index_params.kmeans_trainset_fraction = 1.0; // use whole dataset for kmeans training
// train the index from a [N, D] dataset
auto index_empty = ivf_pq::build(handle, index_params, dataset);
// optional: create a stream pool with at least one stream to enable kernel and copy
// overlapping
raft::resource::set_cuda_stream_pool(handle, std::make_shared<rmm::cuda_stream_pool>(1));
// fill the index with the data
std::optional<raft::host_vector_view<const IdxT, IdxT>> no_op = std::nullopt;
ivf_pq::extend(handle, new_vectors, no_op, &index_empty);

Parameters:
  • handle[in]

  • new_vectors[in] a host matrix view to a row-major matrix [n_rows, idx.dim()]

  • new_indices[in] a host vector view to a vector of indices [n_rows]. If the original index is empty (idx.size() == 0), you can pass std::nullopt here to imply a continuous range [0...n_rows).

  • idx[inout]

Index serialize#

void serialize(
raft::resources const &handle,
std::ostream &os,
const cuvs::neighbors::ivf_pq::index<int64_t> &index
)#

Write the index to an output stream.

#include <raft/core/resources.hpp>

raft::resources handle;

// create an output stream
std::ostream os(std::cout.rdbuf());
// create an index with `auto index = ivf_pq::build(...);`
cuvs::neighbors::ivf_pq::serialize(handle, os, index);

Parameters:
  • handle[in] the raft handle

  • os[in] output stream

  • index[in] IVF-PQ index

void serialize(
raft::resources const &handle,
const std::string &filename,
const cuvs::neighbors::ivf_pq::index<int64_t> &index
)#

Save the index to file.

#include <raft/core/resources.hpp>

raft::resources handle;

// create a string with a filepath
std::string filename("/path/to/index");
// create an index with `auto index = ivf_pq::build(...);`
cuvs::neighbors::ivf_pq::serialize(handle, filename, index);

Parameters:
  • handle[in] the raft handle

  • filename[in] the file name for saving the index

  • index[in] IVF-PQ index

void deserialize(
raft::resources const &handle,
std::istream &str,
cuvs::neighbors::ivf_pq::index<int64_t> *index
)#

Load the index from an input stream.

#include <raft/core/resources.hpp>

raft::resources handle;

// create an input stream
std::istream is(std::cin.rdbuf());

using IdxT = int64_t; // type of the index
// create an empty index
cuvs::neighbors::ivf_pq::index<IdxT> index(handle);

cuvs::neighbors::ivf_pq::deserialize(handle, is, &index);

Parameters:
  • handle[in] the raft handle

  • str[in] the input stream to read the index from

  • index[out] IVF-PQ index

void deserialize(
raft::resources const &handle,
const std::string &filename,
cuvs::neighbors::ivf_pq::index<int64_t> *index
)#

Load index from file.

#include <raft/core/resources.hpp>

raft::resources handle;

// create a string with a filepath
std::string filename("/path/to/index");
using IdxT = int64_t; // type of the index
// create an empty index
cuvs::neighbors::ivf_pq::index<IdxT> index(handle);

cuvs::neighbors::ivf_pq::deserialize(handle, filename, &index);

Parameters:
  • handle[in] the raft handle

  • filename[in] the name of the file that stores the index

  • index[out] IVF-PQ index

Helper Methods#

Additional helper functions for manipulating the underlying data of an IVF-PQ index, unpacking records, and writing PQ codes into an existing IVF list.

namespace cuvs::neighbors::ivf_pq::helpers

void unpack(
raft::resources const &res,
raft::device_mdspan<const uint8_t, list_spec_interleaved<uint32_t, uint32_t>::list_extents, raft::row_major> list_data,
uint32_t pq_bits,
uint32_t offset,
raft::device_matrix_view<uint8_t, uint32_t, raft::row_major> codes
)#

Unpack n_take consecutive records of a single list (cluster) in the compressed index starting at given offset.

Bit compression is removed, which means output will have pq_dim dimensional vectors (one code per byte, instead of ceildiv(pq_dim * pq_bits, 8) bytes of pq codes).

Usage example:

auto list_data = index.lists()[label]->data.view();
// allocate the buffer for the output
uint32_t n_take = 4;
auto codes = raft::make_device_matrix<uint8_t>(res, n_take, index.pq_dim());
uint32_t offset = 0;
// unpack n_take elements from the list
ivf_pq::helpers::codepacker::unpack(res, list_data, index.pq_bits(), offset, codes.view());

Parameters:
  • res[in] raft resource

  • list_data[in] block to read from

  • pq_bits[in] bit length of encoded vector elements

  • offset[in] How many records in the list to skip.

  • codes[out] the destination buffer [n_take, index.pq_dim()]. The length n_take defines how many records to unpack; it must be smaller than the list size.
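
To illustrate what removing bit compression means for a single record, here is a host-side sketch that converts between one code per byte and a bit-packed form of ceildiv(pq_dim * pq_bits, 8) bytes. pack_codes and unpack_codes are hypothetical helpers, and the little-endian bit order is an assumption made for illustration. The interleaved in-list layout that unpack actually reads from is different and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit-pack `codes` (one PQ code per byte, each using `pq_bits` bits)
// into ceildiv(codes.size() * pq_bits, 8) bytes.
inline std::vector<uint8_t> pack_codes(const std::vector<uint8_t>& codes, uint32_t pq_bits)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  std::vector<uint8_t> packed((codes.size() * pq_bits + 7) / 8, 0);
  std::size_t bit_pos = 0;
  for (uint8_t code : codes) {
    assert(code < (1u << pq_bits));  // the code must fit into pq_bits bits
    for (uint32_t b = 0; b < pq_bits; ++b, ++bit_pos) {
      if ((code >> b) & 1u) { packed[bit_pos / 8] |= static_cast<uint8_t>(1u << (bit_pos % 8)); }
    }
  }
  return packed;
}

// Inverse of pack_codes: expand a bit-packed record to one code per byte.
inline std::vector<uint8_t> unpack_codes(const std::vector<uint8_t>& packed,
                                         uint32_t pq_dim,
                                         uint32_t pq_bits)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  assert(packed.size() * 8 >= std::size_t{pq_dim} * pq_bits);
  std::vector<uint8_t> codes(pq_dim, 0);
  std::size_t bit_pos = 0;
  for (uint32_t i = 0; i < pq_dim; ++i) {
    for (uint32_t b = 0; b < pq_bits; ++b, ++bit_pos) {
      if ((packed[bit_pos / 8] >> (bit_pos % 8)) & 1u) { codes[i] |= static_cast<uint8_t>(1u << b); }
    }
  }
  return codes;
}
```

With pq_dim = 8 and pq_bits = 5, a record occupies 5 bytes when packed and 8 bytes when expanded to one code per byte.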

void unpack_contiguous(
raft::resources const &res,
raft::device_mdspan<const uint8_t, list_spec_interleaved<uint32_t, uint32_t>::list_extents, raft::row_major> list_data,
uint32_t pq_bits,
uint32_t offset,
uint32_t n_rows,
uint32_t pq_dim,
uint8_t *codes
)#

Unpack n_rows consecutive records of a single list (cluster) in the compressed index starting at given offset. The output codes of a single vector are contiguous, not expanded to one code per byte, which means the output has ceildiv(pq_dim * pq_bits, 8) bytes per PQ encoded vector.

Usage example:

raft::resources res;
auto list_data = index.lists()[label]->data.view();
// allocate the buffer for the output
uint32_t n_rows = 4;
auto codes = raft::make_device_matrix<uint8_t>(
  res, n_rows, raft::ceildiv(index.pq_dim() * index.pq_bits(), 8));
uint32_t offset = 0;
// unpack n_rows elements from the list
ivf_pq::helpers::codepacker::unpack_contiguous(
  res, list_data, index.pq_bits(), offset, n_rows, index.pq_dim(), codes.data_handle());

Parameters:
  • res[in] raft resource

  • list_data[in] block to read from

  • pq_bits[in] bit length of encoded vector elements

  • offset[in] How many records in the list to skip.

  • n_rows[in] How many records to unpack

  • pq_dim[in] The dimensionality of the PQ compressed records

  • codes[out] the destination buffer [n_rows, ceildiv(pq_dim * pq_bits, 8)]. The length n_rows defines how many records to unpack; it must be smaller than the list size.
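
The two unpacking functions need differently sized output buffers. The sketch below computes both sizes on the host. The helper names are hypothetical and only restate the extents given in the parameter lists: pq_dim bytes per record for unpack, and ceildiv(pq_dim * pq_bits, 8) bytes per record for unpack_contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Integer division rounding up.
constexpr uint32_t ceildiv(uint32_t a, uint32_t b) { return (a + b - 1) / b; }

// Bytes needed for `n_rows` records expanded to one code per byte.
inline std::size_t unpacked_buffer_bytes(uint32_t n_rows, uint32_t pq_dim)
{
  return std::size_t{n_rows} * pq_dim;
}

// Bytes needed for `n_rows` records kept in bit-packed (contiguous) form.
inline std::size_t contiguous_buffer_bytes(uint32_t n_rows, uint32_t pq_dim, uint32_t pq_bits)
{
  assert(pq_bits >= 4 && pq_bits <= 8);
  return std::size_t{n_rows} * ceildiv(pq_dim * pq_bits, 8);
}
```

For example, 4 records with pq_dim = 32 and pq_bits = 5 need 128 bytes when unpacked and 80 bytes in contiguous form; with pq_bits = 8 the two sizes coincide.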

void pack(
raft::resources const &res,
raft::device_matrix_view<const uint8_t, uint32_t, raft::row_major> codes,
uint32_t pq_bits,
uint32_t offset,
raft::device_mdspan<uint8_t, list_spec_interleaved<uint32_t, uint32_t>::list_extents, raft::row_major> list_data
)#

Write flat PQ codes into an existing list at the given offset.

NB: no memory allocation happens here; the list must fit the data (offset + n_vec).

Usage example:

auto list_data  = index.lists()[label]->data.view();
// allocate the buffer for the input codes
auto codes = raft::make_device_matrix<uint8_t>(res, n_vec, index.pq_dim());
// ... prepare n_vec vectors to pack into the list in codes ...
// write codes into the list starting from the 42nd position
ivf_pq::helpers::codepacker::pack(
    res, raft::make_const_mdspan(codes.view()), index.pq_bits(), 42, list_data);

Parameters:
  • res[in] raft resource

  • codes[in] flat PQ codes, one code per byte [n_vec, pq_dim]

  • pq_bits[in] bit length of encoded vector elements

  • offset[in] how many records to skip before writing the data into the list

  • list_data[inout] block to write into
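
Because pack does not allocate, the caller has to make sure the write stays inside the list. A minimal host-side check could look like the sketch below. fits_in_list and require_fits are hypothetical helpers, and list_capacity stands for the number of records the target list can hold; how that number is obtained from the index is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// True if writing `n_vec` records starting at record `offset`
// stays within a list that can hold `list_capacity` records.
inline bool fits_in_list(uint32_t list_capacity, uint32_t offset, uint32_t n_vec)
{
  // Widen to 64 bits so that offset + n_vec cannot wrap around.
  return static_cast<uint64_t>(offset) + n_vec <= list_capacity;
}

// Debug-time guard to call before packing.
inline void require_fits(uint32_t list_capacity, uint32_t offset, uint32_t n_vec)
{
  assert(fits_in_list(list_capacity, offset, n_vec));
}
```

The 64-bit widening matters only for very large offsets, but it keeps the check correct for every uint32_t input.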

void pack_contiguous(
raft::resources const &res,
const uint8_t *codes,
uint32_t n_rows,
uint32_t pq_dim,
uint32_t pq_bits,
uint32_t offset,
raft::device_mdspan<uint8_t, list_spec_interleaved<uint32_t, uint32_t>::list_extents, raft::row_major> list_data
)#

Write flat PQ codes into an existing list at the given offset. The input codes of a single vector are contiguous (not expanded to one code per byte).

NB: no memory allocation happens here; the list must fit the data (offset + n_rows records).

Usage example:

raft::resources res;
auto list_data  = index.lists()[label]->data.view();
// allocate the buffer for the input codes
auto codes = raft::make_device_matrix<uint8_t>(
  res, n_rows, raft::ceildiv(index.pq_dim() * index.pq_bits(), 8));
// ... prepare compressed vectors to pack into the list in codes ...
// write codes into the list starting from the 42nd position. If the current size of the list
// is greater than 42, this will overwrite the codes starting at this offset.
ivf_pq::helpers::codepacker::pack_contiguous(
  res, codes.data_handle(), n_rows, index.pq_dim(), index.pq_bits(), 42, list_data);

Parameters:
  • res[in] raft resource

  • codes[in] flat PQ codes, [n_rows, ceildiv(pq_dim * pq_bits, 8)]

  • n_rows[in] number of records

  • pq_dim[in] number of PQ codes per vector (dimensionality of the encoded vectors)

  • pq_bits[in] bit length of encoded vector elements

  • offset[in] how many records to skip before writing the data into the list

  • list_data[in] block to write into
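The row width of the contiguous input buffer follows directly from pq_dim and pq_bits. As a rough illustration, the arithmetic can be reproduced on the host as below. The helper packed_code_bytes is a hypothetical name used only for this sketch; it mirrors the ceildiv(pq_dim * pq_bits, 8) expression from the parameter list.

```cpp
#include <cstdint>

// Hypothetical host-side helper (not part of cuVS): number of bytes one
// PQ-encoded vector occupies in the contiguous format, i.e.
// ceildiv(pq_dim * pq_bits, 8).
constexpr std::uint32_t packed_code_bytes(std::uint32_t pq_dim, std::uint32_t pq_bits)
{
  return (pq_dim * pq_bits + 7u) / 8u;
}
```

For example, with pq_dim = 32 a vector takes 32 bytes at pq_bits = 8 but only 20 bytes at pq_bits = 5, whereas the one-code-per-byte format always takes pq_dim bytes.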

void pack_list_data(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const uint8_t, uint32_t, raft::row_major> codes,
uint32_t label,
uint32_t offset
)#

Write flat PQ codes into an existing list by the given offset.

The list is identified by its label.

NB: no memory allocation happens here; the list must fit the data (offset + n_vec).

Usage example:

// We will write into the 137th cluster
uint32_t label = 137;
// allocate the buffer for the input codes
auto codes = raft::make_device_matrix<uint8_t, uint32_t>(res, n_vec, index.pq_dim());
... prepare n_vecs to pack into the list in codes ...
// write codes into the list starting from the 42nd position
ivf_pq::helpers::codepacker::pack_list_data(
    res, &index, raft::make_const_mdspan(codes.view()), label, 42);

Parameters:
  • res[in] raft resource

  • index[inout] IVF-PQ index.

  • codes[in] flat PQ codes, one code per byte [n_rows, pq_dim]

  • label[in] The id of the list (cluster) into which we write.

  • offset[in] how many records to skip before writing the data into the list

void pack_contiguous_list_data(
raft::resources const &res,
index<int64_t> *index,
uint8_t *codes,
uint32_t n_rows,
uint32_t label,
uint32_t offset
)#

Write flat PQ codes into an existing list by the given offset. Use this when the input vectors are PQ encoded and not expanded to one code per byte.

The list is identified by its label.

NB: no memory allocation happens here; the list into which the vectors are packed must fit offset + n_rows rows.

Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(res, index_params, dataset, N, D);
// allocate the buffer for n_rows input codes. Each vector occupies
// raft::ceildiv(index.pq_dim() * index.pq_bits(), 8) bytes because
// codes are compressed and without gaps.
auto codes = raft::make_device_matrix<uint8_t>(
  res, n_rows, raft::ceildiv(index.pq_dim() * index.pq_bits(), 8));
... prepare the compressed vectors to pack into the list in codes ...
// the first n_rows codes in the fourth IVF list are to be overwritten.
uint32_t label = 3;
// write codes into the list starting from the 0th position
ivf_pq::helpers::codepacker::pack_contiguous_list_data(
  res, &index, codes.data_handle(), n_rows, label, 0);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

  • codes[in] flat contiguous PQ codes [n_rows, ceildiv(pq_dim * pq_bits, 8)]

  • n_rows[in] how many records to pack

  • label[in] The id of the list (cluster) into which we write.

  • offset[in] how many records to skip before writing the data into the list
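To make the difference between the two input formats concrete, the sketch below converts one vector between the one-code-per-byte format [pq_dim] and the contiguous format [ceildiv(pq_dim * pq_bits, 8)] on the host. It is an illustration only: the function names are hypothetical, and the bit order (least significant bit first within each byte) is an assumption that may not match the order cuVS uses internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack one vector's codes (one code per byte, each value < 2^pq_bits) into a
// gap-free bit stream. ASSUMPTION: bits are written starting from the least
// significant bit of each output byte; the library's bit order may differ.
std::vector<std::uint8_t> pack_codes_contiguous(const std::vector<std::uint8_t>& codes,
                                                std::uint32_t pq_bits)
{
  const std::size_t total_bits = codes.size() * pq_bits;
  std::vector<std::uint8_t> packed((total_bits + 7) / 8, 0);
  std::size_t bit_pos = 0;
  for (std::uint8_t code : codes) {
    for (std::uint32_t b = 0; b < pq_bits; ++b, ++bit_pos) {
      if ((code >> b) & 1u) {
        packed[bit_pos / 8] |= static_cast<std::uint8_t>(1u << (bit_pos % 8));
      }
    }
  }
  return packed;
}

// Inverse of pack_codes_contiguous under the same bit-order assumption:
// expand a contiguous bit stream back to one code per byte.
std::vector<std::uint8_t> unpack_codes_contiguous(const std::vector<std::uint8_t>& packed,
                                                  std::uint32_t pq_dim,
                                                  std::uint32_t pq_bits)
{
  std::vector<std::uint8_t> codes(pq_dim, 0);
  std::size_t bit_pos = 0;
  for (std::uint8_t& code : codes) {
    for (std::uint32_t b = 0; b < pq_bits; ++b, ++bit_pos) {
      if ((packed[bit_pos / 8] >> (bit_pos % 8)) & 1u) {
        code |= static_cast<std::uint8_t>(1u << b);
      }
    }
  }
  return codes;
}
```

With pq_bits = 8 both formats coincide; for smaller pq_bits the contiguous format is shorter, which is why pack_contiguous_list_data takes a raw pointer plus n_rows rather than a [n_rows, pq_dim] matrix view.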

void unpack_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<uint8_t, uint32_t, raft::row_major> out_codes,
uint32_t label,
uint32_t offset
)#

Unpack n_take consecutive records of a single list (cluster) in the compressed index starting at given offset, one code per byte (independently of pq_bits).

Usage example:

// We will unpack the fourth cluster
uint32_t label = 3;
// Get the list size
uint32_t list_size = 0;
raft::copy(&list_size, index.list_sizes().data_handle() + label, 1,
           raft::resource::get_cuda_stream(res));
raft::resource::sync_stream(res);
// allocate the buffer for the output
auto codes = raft::make_device_matrix<uint8_t>(res, list_size, index.pq_dim());
// unpack the whole list
ivf_pq::helpers::codepacker::unpack_list_data(res, index, codes.view(), label, 0);

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • out_codes[out] the destination buffer [n_take, index.pq_dim()]. The length n_take defines how many records to unpack; offset + n_take must be smaller than or equal to the list size.

  • label[in] The id of the list (cluster) to decode.

  • offset[in] How many records in the list to skip.

void unpack_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_vector_view<const uint32_t> in_cluster_indices,
raft::device_matrix_view<uint8_t, uint32_t, raft::row_major> out_codes,
uint32_t label
)#

Unpack a series of records of a single list (cluster) in the compressed index by their in-list offsets, one code per byte (independently of pq_bits).

Usage example:

// We will unpack the fourth cluster
uint32_t label = 3;
// Create the selection vector
auto selected_indices = raft::make_device_vector<uint32_t>(res, 4);
... fill the indices ...
raft::resource::sync_stream(res);
// allocate the buffer for the output
auto codes = raft::make_device_matrix<uint8_t>(res, selected_indices.size(), index.pq_dim());
// unpack the selected records
ivf_pq::helpers::codepacker::unpack_list_data(
    res, index, selected_indices.view(), codes.view(), label);

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • in_cluster_indices[in] The offsets of the selected indices within the cluster.

  • out_codes[out] the destination buffer [n_take, index.pq_dim()]. The length n_take defines how many records to unpack; it must not exceed the list size.

  • label[in] The id of the list (cluster) to decode.
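Conceptually, this overload is a row gather: output row i receives the codes of the record at in-list offset in_cluster_indices[i]. A minimal host-side sketch of that selection follows, assuming a row-major [list_size, pq_dim] matrix of already unpacked codes. The function gather_rows is hypothetical and only illustrates the indexing, not the GPU implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side illustration (not the cuVS implementation): copy the rows named
// by in_cluster_indices from a row-major [list_size, pq_dim] matrix into a
// row-major [n_take, pq_dim] output, where n_take = in_cluster_indices.size().
std::vector<std::uint8_t> gather_rows(const std::vector<std::uint8_t>& list_codes,
                                      std::uint32_t pq_dim,
                                      const std::vector<std::uint32_t>& in_cluster_indices)
{
  std::vector<std::uint8_t> out;
  out.reserve(in_cluster_indices.size() * pq_dim);
  for (std::uint32_t row : in_cluster_indices) {
    const auto first = list_codes.begin() + static_cast<std::ptrdiff_t>(row) * pq_dim;
    out.insert(out.end(), first, first + pq_dim);
  }
  return out;
}
```

The order of the output rows follows the order of the offsets, so the selection vector can also be used to reorder records.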

void unpack_contiguous_list_data(
raft::resources const &res,
const index<int64_t> &index,
uint8_t *out_codes,
uint32_t n_rows,
uint32_t label,
uint32_t offset
)#

Unpack n_rows consecutive PQ encoded vectors of a single list (cluster) in the compressed index starting at given offset, not expanded to one code per byte. Each code in the output buffer occupies ceildiv(index.pq_dim() * index.pq_bits(), 8) bytes.

Usage example:

raft::resources res;
// We will unpack the whole fourth cluster
uint32_t label = 3;
// Get the list size
uint32_t list_size = 0;
raft::update_host(&list_size, index.list_sizes().data_handle() + label, 1,
  raft::resource::get_cuda_stream(res));
raft::resource::sync_stream(res);
// allocate the buffer for the output
auto codes = raft::make_device_matrix<uint8_t>(res, list_size, raft::ceildiv(index.pq_dim() *
   index.pq_bits(), 8));
// unpack the whole list
ivf_pq::helpers::codepacker::unpack_contiguous_list_data(res, index, codes.data_handle(),
   list_size, label, 0);

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • out_codes[out] the destination buffer [n_rows, ceildiv(index.pq_dim() * index.pq_bits(), 8)]. The length n_rows defines how many records to unpack, offset + n_rows must be smaller than or equal to the list size.

  • n_rows[in] how many codes to unpack

  • label[in] The id of the list (cluster) to decode.

  • offset[in] How many records in the list to skip.

void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<float, uint32_t, raft::row_major> out_vectors,
uint32_t label,
uint32_t offset
)#

Decode n_take consecutive records of a single list (cluster) in the compressed index starting at given offset.

Usage example:

// We will reconstruct the fourth cluster
uint32_t label = 3;
// Get the list size
uint32_t list_size = 0;
raft::copy(&list_size, index.list_sizes().data_handle() + label, 1,
           raft::resource::get_cuda_stream(res));
raft::resource::sync_stream(res);
// allocate the buffer for the output
auto decoded_vectors = raft::make_device_matrix<float>(res, list_size, index.dim());
// decode the whole list
ivf_pq::helpers::codepacker::reconstruct_list_data(
    res, index, decoded_vectors.view(), label, 0);

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • out_vectors[out] the destination buffer [n_take, index.dim()]. The length n_take defines how many records to reconstruct; offset + n_take must be smaller than or equal to the list size.

  • label[in] The id of the list (cluster) to decode.

  • offset[in] How many records in the list to skip.

void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<half, uint32_t, raft::row_major> out_vectors,
uint32_t label,
uint32_t offset
)#
void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<int8_t, uint32_t, raft::row_major> out_vectors,
uint32_t label,
uint32_t offset
)#
void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<uint8_t, uint32_t, raft::row_major> out_vectors,
uint32_t label,
uint32_t offset
)#
void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_vector_view<const uint32_t> in_cluster_indices,
raft::device_matrix_view<float, uint32_t, raft::row_major> out_vectors,
uint32_t label
)#

Decode a series of records of a single list (cluster) in the compressed index by their in-list offsets.

Usage example:

// We will reconstruct the fourth cluster
uint32_t label = 3;
// Create the selection vector
auto selected_indices = raft::make_device_vector<uint32_t>(res, 4);
... fill the indices ...
raft::resource::sync_stream(res);
// allocate the buffer for the output
auto decoded_vectors = raft::make_device_matrix<float>(
                          res, selected_indices.size(), index.dim());
// decode the selected records
ivf_pq::helpers::codepacker::reconstruct_list_data(
    res, index, selected_indices.view(), decoded_vectors.view(), label);

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • in_cluster_indices[in] The offsets of the selected indices within the cluster.

  • out_vectors[out] the destination buffer [n_take, index.dim()]. The length n_take defines how many records to reconstruct; it must not exceed the list size.

  • label[in] The id of the list (cluster) to decode.

void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_vector_view<const uint32_t> in_cluster_indices,
raft::device_matrix_view<half, uint32_t, raft::row_major> out_vectors,
uint32_t label
)#
void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_vector_view<const uint32_t> in_cluster_indices,
raft::device_matrix_view<int8_t, uint32_t, raft::row_major> out_vectors,
uint32_t label
)#
void reconstruct_list_data(
raft::resources const &res,
const index<int64_t> &index,
raft::device_vector_view<const uint32_t> in_cluster_indices,
raft::device_matrix_view<uint8_t, uint32_t, raft::row_major> out_vectors,
uint32_t label
)#
void extend_list_with_codes(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const uint8_t, uint32_t, raft::row_major> new_codes,
raft::device_vector_view<const int64_t, uint32_t, raft::row_major> new_indices,
uint32_t label
)#

Extend one list of the index in-place, by the list label, skipping the classification and encoding steps.

Usage example:

// We will extend the fourth cluster
uint32_t label = 3;
// We will fill 4 new vectors
uint32_t n_vec = 4;
// Indices of the new vectors
auto indices = raft::make_device_vector<int64_t, uint32_t>(res, n_vec);
... fill the indices ...
auto new_codes = raft::make_device_matrix<uint8_t, uint32_t, raft::row_major>(
    res, n_vec, index.pq_dim());
... fill codes ...
// extend list with new codes
ivf_pq::helpers::codepacker::extend_list_with_codes(
    res, &index, new_codes.view(), indices.view(), label);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

  • new_codes[in] flat PQ codes, one code per byte [n_rows, index.pq_dim()]

  • new_indices[in] source indices [n_rows]

  • label[in] the id of the target list (cluster).

void extend_list_with_contiguous_codes(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const uint8_t, uint32_t, raft::row_major> new_codes,
raft::device_vector_view<const int64_t, uint32_t, raft::row_major> new_indices,
uint32_t label
)#

Extend one list of the index in-place, by the list label, skipping the classification and encoding steps. Uses contiguous/packed codes format.

This is similar to extend_list_with_codes but takes codes in contiguous packed format [n_rows, ceildiv(pq_dim * pq_bits, 8)] instead of unpacked format [n_rows, pq_dim]. This works correctly with any pq_bits value.

Usage example:

// We will extend the fourth cluster
uint32_t label = 3;
// We will fill 4 new vectors
uint32_t n_vec = 4;
// Indices of the new vectors
auto indices = raft::make_device_vector<int64_t>(res, n_vec);
... fill the indices ...
// Allocate buffer for packed codes
uint32_t code_size = raft::ceildiv(index.pq_dim() * index.pq_bits(), 8u);
auto new_codes = raft::make_device_matrix<uint8_t, uint32_t, raft::row_major>(res, n_vec, code_size);
... fill codes ...
// extend list with new codes
ivf_pq::helpers::codepacker::extend_list_with_contiguous_codes(
    res, &index, new_codes.view(), indices.view(), label);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

  • new_codes[in] flat contiguous PQ codes [n_rows, ceildiv(pq_dim * pq_bits, 8)]

  • new_indices[in] source indices [n_rows]

  • label[in] the id of the target list (cluster).

void extend_list(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const float, uint32_t, raft::row_major> new_vectors,
raft::device_vector_view<const int64_t, uint32_t, raft::row_major> new_indices,
uint32_t label
)#

Extend one list of the index in-place, by the list label, skipping the classification step.

Usage example:

// We will extend the fourth cluster
uint32_t label = 3;
// We will extend with 4 new vectors
uint32_t n_vec = 4;
// Indices of the new vectors
auto indices = raft::make_device_vector<int64_t, uint32_t>(res, n_vec);
... fill the indices ...
auto new_vectors = raft::make_device_matrix<float, uint32_t, raft::row_major>(
    res, n_vec, index.dim());
... fill vectors ...
// extend list with new vectors
ivf_pq::helpers::codepacker::extend_list(
    res, &index, new_vectors.view(), indices.view(), label);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

  • new_vectors[in] data to encode [n_rows, index.dim()]

  • new_indices[in] source indices [n_rows]

  • label[in] the id of the target list (cluster).

void extend_list(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const int8_t, uint32_t, raft::row_major> new_vectors,
raft::device_vector_view<const int64_t, uint32_t, raft::row_major> new_indices,
uint32_t label
)#
void extend_list(
raft::resources const &res,
index<int64_t> *index,
raft::device_matrix_view<const uint8_t, uint32_t, raft::row_major> new_vectors,
raft::device_vector_view<const int64_t, uint32_t, raft::row_major> new_indices,
uint32_t label
)#
void erase_list(
raft::resources const &res,
index<int64_t> *index,
uint32_t label
)#

Remove all data from a single list (cluster) in the index.

Usage example:

// We will erase the fourth cluster (label = 3)
ivf_pq::helpers::erase_list(res, &index, 3);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

  • label[in] the id of the target list (cluster).

void reset_index(const raft::resources &res, index<int64_t> *index)#

Public helper API to reset the data and indices pointers and the list sizes. Useful for externally modifying the index without going through the build stage. The data and indices of the IVF lists will be lost.

Usage example:

raft::resources res;
using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// initialize an empty index
ivf_pq::index<int64_t> index(res, index_params, D);
// reset the index's state and list sizes
ivf_pq::helpers::reset_index(res, &index);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

void pad_centers_with_norms(
raft::resources const &res,
raft::device_matrix_view<const float, uint32_t, raft::row_major> centers,
raft::device_matrix_view<float, uint32_t, raft::row_major> padded_centers
)#

Pad cluster centers with their L2 norms for efficient GEMM operations.

This function takes cluster centers and pads them with their L2 norms to create extended centers suitable for coarse search operations. The output has dimensions [n_centers, dim_ext] where dim_ext = round_up(dim + 1, 8).

Parameters:
  • res[in] raft resource

  • centers[in] cluster centers [n_centers, dim]

  • padded_centers[out] padded centers with norms [n_centers, dim_ext]
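The width of the padded_centers buffer has to be computed by the caller before allocating it. The sketch below reproduces the dim_ext = round_up(dim + 1, 8) formula on the host. The helper names round_up_to and padded_center_dim are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper: round `value` up to the nearest multiple of `multiple`.
constexpr std::uint32_t round_up_to(std::uint32_t value, std::uint32_t multiple)
{
  return ((value + multiple - 1u) / multiple) * multiple;
}

// Hypothetical helper mirroring dim_ext = round_up(dim + 1, 8): one extra
// column for the norm, then padding up to a multiple of 8.
constexpr std::uint32_t padded_center_dim(std::uint32_t dim)
{
  return round_up_to(dim + 1u, 8u);
}
```

For instance, dim = 128 gives dim_ext = 136, so the output buffer would be allocated as [n_centers, 136]. Note that a dim that is already a multiple of 8 still grows by a full 8 columns because of the extra norm column.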

void pad_centers_with_norms(
raft::resources const &res,
raft::host_matrix_view<const float, uint32_t, raft::row_major> centers,
raft::device_matrix_view<float, uint32_t, raft::row_major> padded_centers
)#

Pad cluster centers with their L2 norms for efficient GEMM operations.

This function takes cluster centers and pads them with their L2 norms to create extended centers suitable for coarse search operations. The output has dimensions [n_centers, dim_ext] where dim_ext = round_up(dim + 1, 8).

Parameters:
  • res[in] raft resource

  • centers[in] cluster centers [n_centers, dim]

  • padded_centers[out] padded centers with norms [n_centers, dim_ext]

void rotate_padded_centers(
raft::resources const &res,
raft::device_matrix_view<const float, uint32_t, raft::row_major> padded_centers,
raft::device_matrix_view<const float, uint32_t, raft::row_major> rotation_matrix,
raft::device_matrix_view<float, uint32_t, raft::row_major> rotated_centers
)#

Rotate padded centers with the rotation matrix.

Parameters:
  • res[in] raft resource

  • padded_centers[in] padded centers [n_centers, dim_ext]

  • rotation_matrix[in] rotation matrix [rot_dim, dim]

  • rotated_centers[out] rotated centers [n_centers, rot_dim]

void extract_centers(
raft::resources const &res,
const index<int64_t> &index,
raft::device_matrix_view<float, int64_t, raft::row_major> cluster_centers
)#

Public helper API for fetching a trained index’s IVF centroids.

Usage example:

raft::resources res;
// allocate the buffer for the output centers
auto cluster_centers = raft::make_device_matrix<float, int64_t>(
  res, index.n_lists(), index.dim());
// Extract the IVF centroids into the buffer
cuvs::neighbors::ivf_pq::helpers::extract_centers(res, index, cluster_centers.view());

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • cluster_centers[out] IVF cluster centers [index.n_lists(), index.dim()]

void extract_centers(
raft::resources const &res,
const index<int64_t> &index,
raft::host_matrix_view<float, uint32_t, raft::row_major> cluster_centers
)#

Public helper API for fetching a trained index’s IVF centroids into a host buffer.

Usage example:

raft::resources res;
// allocate the host buffer for the output centers
auto cluster_centers = raft::make_host_matrix<float, uint32_t>(
  index.n_lists(), index.dim());
// Extract the IVF centroids into the host buffer
cuvs::neighbors::ivf_pq::helpers::extract_centers(res, index, cluster_centers.view());

Parameters:
  • res[in] raft resource

  • index[in] IVF-PQ index (passed by reference)

  • cluster_centers[out] IVF cluster centers [index.n_lists(), index.dim()]

void recompute_internal_state(
const raft::resources &res,
index<int64_t> *index
)#

Helper exposing the re-computation of list sizes and related arrays if IVF lists have been modified externally.

Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// use default index parameters
ivf_pq::index_params index_params;
// initialize an empty index
ivf_pq::index<int64_t> index(res, index_params, D);
ivf_pq::helpers::reset_index(res, &index);
// resize the first IVF list to hold 5 records
auto spec = ivf_pq::list_spec_interleaved<uint32_t, int64_t>{
  index.pq_bits(), index.pq_dim(), index.conservative_memory_allocation()};
uint32_t new_size = 5;
ivf_pq::helpers::resize_list(res, index.lists()[0], spec, new_size, 0);
raft::update_device(index.list_sizes().data_handle(), &new_size, 1,
  raft::resource::get_cuda_stream(res));
// recompute the internal state of the index
ivf_pq::helpers::recompute_internal_state(res, &index);

Parameters:
  • res[in] raft resource

  • index[inout] pointer to IVF-PQ index

void make_rotation_matrix(
raft::resources const &res,
raft::device_matrix_view<float, uint32_t, raft::row_major> rotation_matrix,
bool force_random_rotation
)#

Generate a rotation matrix into user-provided buffer (standalone version).

This standalone helper generates a rotation matrix without requiring an index object. Users can call this to prepare a rotation matrix before building from precomputed data.

Usage example:

raft::resources res;
uint32_t dim = 128, pq_dim = 32;
uint32_t rot_dim = pq_dim * ((dim + pq_dim - 1) / pq_dim);  // rounded up

// Allocate rotation matrix buffer [rot_dim, dim]
auto rotation_matrix = raft::make_device_matrix<float, uint32_t>(res, rot_dim, dim);

// Generate the rotation matrix
ivf_pq::helpers::make_rotation_matrix(
  res, rotation_matrix.view(), true);

Parameters:
  • res[in] raft resource

  • rotation_matrix[out] Output buffer [rot_dim, dim] for the rotation matrix

  • force_random_rotation[in] If false and rot_dim == dim, creates identity matrix. If true or rot_dim != dim, creates random orthogonal matrix.
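The interaction between dim, pq_dim and force_random_rotation can be worked out on the host before allocating the buffer. The sketch below restates the rot_dim formula from the usage example and the identity rule from the parameter description. Both function names are hypothetical and not part of the cuVS API.

```cpp
#include <cstdint>

// Hypothetical helper: dim rounded up to the nearest multiple of pq_dim,
// as in the usage example above.
constexpr std::uint32_t rotated_dim(std::uint32_t dim, std::uint32_t pq_dim)
{
  return pq_dim * ((dim + pq_dim - 1u) / pq_dim);
}

// Hypothetical helper restating the documented rule: the result is an
// identity matrix only when no random rotation is forced and
// rot_dim == dim; otherwise a random orthogonal matrix is generated.
constexpr bool yields_identity(std::uint32_t rot_dim, std::uint32_t dim, bool force_random_rotation)
{
  return !force_random_rotation && rot_dim == dim;
}
```

With dim = 128 and pq_dim = 32 the rotated dimension stays 128, so passing false would give an identity matrix. With dim = 100 and pq_dim = 32 the rotated dimension becomes 128, and a random orthogonal matrix is generated regardless of the flag.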

void resize_list(
raft::resources const &res,
std::shared_ptr<list_data_base<int64_t, uint32_t>> &orig_list,
const list_spec_flat<uint32_t, int64_t> &spec,
uint32_t new_used_size,
uint32_t old_used_size
)#

Resize an IVF-PQ list with flat layout.

This helper resizes an IVF list that uses the flat (non-interleaved) PQ code layout. If the new size exceeds the current capacity, a new list is allocated and existing data is copied. The function handles the type casting internally.

Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// Assuming index uses FLAT layout
auto spec = ivf_pq::list_spec_flat<uint32_t, int64_t>{
  index.pq_bits(), index.pq_dim(), index.conservative_memory_allocation()};
uint32_t old_size = current_list_size;
uint32_t new_size = old_size + n_new_vectors;
ivf_pq::helpers::resize_list(res, index.lists()[label], spec, new_size, old_size);

Parameters:
  • res[in] raft resource

  • orig_list[inout] the list to resize (may be replaced with a new allocation)

  • spec[in] the list specification containing pq_bits, pq_dim, and allocation settings

  • new_used_size[in] the new size of the list (number of vectors)

  • old_used_size[in] the current size of the list (data up to this size is preserved)

void resize_list(
raft::resources const &res,
std::shared_ptr<list_data_base<int64_t, uint32_t>> &orig_list,
const list_spec_interleaved<uint32_t, int64_t> &spec,
uint32_t new_used_size,
uint32_t old_used_size
)#

Resize an IVF-PQ list with interleaved layout.

This helper resizes an IVF list that uses the interleaved PQ code layout (default). If the new size exceeds the current capacity, a new list is allocated and existing data is copied. The function handles the type casting internally.

Usage example:

using namespace cuvs::neighbors;
raft::resources res;
// Assuming index uses INTERLEAVED layout (default)
auto spec = ivf_pq::list_spec_interleaved<uint32_t, int64_t>{
  index.pq_bits(), index.pq_dim(), index.conservative_memory_allocation()};
uint32_t old_size = current_list_size;
uint32_t new_size = old_size + n_new_vectors;
ivf_pq::helpers::resize_list(res, index.lists()[label], spec, new_size, old_size);

Parameters:
  • res[in] raft resource

  • orig_list[inout] the list to resize (may be replaced with a new allocation)

  • spec[in] the list specification containing pq_bits, pq_dim, and allocation settings

  • new_used_size[in] the new size of the list (number of vectors)

  • old_used_size[in] the current size of the list (data up to this size is preserved)