Refinement#

Candidate refinement methods for nearest neighbors search

#include <cuvs/neighbors/refine.hpp>

namespace cuvs::neighbors

Index#

void refine(
raft::resources const &handle,
raft::device_matrix_view<const float, int64_t, raft::row_major> dataset,
raft::device_matrix_view<const float, int64_t, raft::row_major> queries,
raft::device_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::device_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::device_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] device matrix that stores the dataset [n_rows, dims]

  • queries[in] device matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] device matrix that stores the refined indices [n_queries, k]

  • distances[out] device matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::device_matrix_view<const half, int64_t, raft::row_major> dataset,
raft::device_matrix_view<const half, int64_t, raft::row_major> queries,
raft::device_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::device_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::device_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] device matrix that stores the dataset [n_rows, dims]

  • queries[in] device matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] device matrix that stores the refined indices [n_queries, k]

  • distances[out] device matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
raft::device_matrix_view<const int8_t, int64_t, raft::row_major> queries,
raft::device_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::device_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::device_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] device matrix that stores the dataset [n_rows, dims]

  • queries[in] device matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] device matrix that stores the refined indices [n_queries, k]

  • distances[out] device matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
raft::device_matrix_view<const uint8_t, int64_t, raft::row_major> queries,
raft::device_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::device_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::device_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] device matrix that stores the dataset [n_rows, dims]

  • queries[in] device matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] device matrix that stores the refined indices [n_queries, k]

  • distances[out] device matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::host_matrix_view<const float, int64_t, raft::row_major> dataset,
raft::host_matrix_view<const float, int64_t, raft::row_major> queries,
raft::host_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::host_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::host_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] host matrix that stores the dataset [n_rows, dims]

  • queries[in] host matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] host matrix with indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] host matrix that stores the refined indices [n_queries, k]

  • distances[out] host matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::host_matrix_view<const float, int64_t, raft::row_major> dataset,
raft::host_matrix_view<const float, int64_t, raft::row_major> queries,
raft::host_matrix_view<const uint32_t, int64_t, raft::row_major> neighbor_candidates,
raft::host_matrix_view<uint32_t, int64_t, raft::row_major> indices,
raft::host_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] host matrix that stores the dataset [n_rows, dims]

  • queries[in] host matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] host matrix with indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] host matrix that stores the refined indices [n_queries, k]

  • distances[out] host matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::host_matrix_view<const half, int64_t, raft::row_major> dataset,
raft::host_matrix_view<const half, int64_t, raft::row_major> queries,
raft::host_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::host_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::host_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] host matrix that stores the dataset [n_rows, dims]

  • queries[in] host matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] host matrix with indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] host matrix that stores the refined indices [n_queries, k]

  • distances[out] host matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> dataset,
raft::host_matrix_view<const int8_t, int64_t, raft::row_major> queries,
raft::host_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::host_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::host_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] host matrix that stores the dataset [n_rows, dims]

  • queries[in] host matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] host matrix with indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] host matrix that stores the refined indices [n_queries, k]

  • distances[out] host matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default

void refine(
raft::resources const &handle,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> dataset,
raft::host_matrix_view<const uint8_t, int64_t, raft::row_major> queries,
raft::host_matrix_view<const int64_t, int64_t, raft::row_major> neighbor_candidates,
raft::host_matrix_view<int64_t, int64_t, raft::row_major> indices,
raft::host_matrix_view<float, int64_t, raft::row_major> distances,
cuvs::distance::DistanceType metric = cuvs::distance::DistanceType::L2Unexpanded,
)#

Refine nearest neighbor search.

Refinement is an operation that follows an approximate NN search. The approximate search has already selected n_candidates neighbor candidates for each query. We narrow it down to k neighbors. For each query, we calculate the exact distance between the query and its n_candidates neighbor candidate, and select the k nearest ones.

The k nearest neighbors and distances are returned.

Example usage

using namespace cuvs::neighbors;
// use default index parameters
ivf_pq::index_params index_params;
// create and fill the index from a [N, D] dataset
auto index = ivf_pq::build(handle, index_params, dataset);
// use default search parameters
ivf_pq::search_params search_params;
// search m = 4 * k nearest neighbours for each of the N queries
ivf_pq::search(handle, search_params, index, queries, neighbor_candidates,
               out_dists_tmp);
// refine it to the k nearest one
refine(handle, dataset, queries, neighbor_candidates, out_indices, out_dists,
        index.metric());

Parameters:
  • handle[in] the raft handle

  • dataset[in] host matrix that stores the dataset [n_rows, dims]

  • queries[in] host matrix of the queries [n_queris, dims]

  • neighbor_candidates[in] host matrix with indices of candidate vectors [n_queries, n_candidates], where n_candidates >= k

  • indices[out] host matrix that stores the refined indices [n_queries, k]

  • distances[out] host matrix that stores the refined distances [n_queries, k]

  • metric[in] distance metric to use. Euclidean (L2) is used by default