Attention
The vector search and clustering algorithms in RAFT are being migrated to a new library dedicated to vector search called cuVS. We will continue to support the vector search algorithms in RAFT during this move, but will no longer update them after the RAPIDS 24.06 (June) release. We plan to complete the migration by RAPIDS 24.10 (October) release and they will be removed from RAFT altogether in the 24.12 (December) release.
Bitset#
#include <raft/core/bitset.cuh>
namespace raft::core
-
template<typename bitset_t = uint32_t, typename index_t = uint32_t>
struct bitset_view# - #include <bitset.hpp>
View of a RAFT Bitset.
This lightweight structure stores a pointer to a bitset in device memory with it’s length. It provides a test() device function to check if a given index is set in the bitset.
- Template Parameters:
bitset_t – Underlying type of the bitset array. Default is uint32_t.
index_t – Indexing type used. Default is uint32_t.
Public Functions
-
inline _RAFT_HOST_DEVICE bitset_view(bitset_t *bitset_ptr, index_t bitset_len, index_t original_nbits = 0)#
Create a bitset view from a device pointer to the bitset.
- Parameters:
bitset_ptr – Device pointer to the bitset
bitset_len – Number of bits in the bitset
original_nbits – Original number of bits used when the bitset was created, to handle potential mismatches of data types. This is useful for using ANN indexes when a bitset was originally created with a different data type than the ones currently supported in cuVS ANN indexes.
-
inline _RAFT_HOST_DEVICE bitset_view(raft::device_vector_view<bitset_t, index_t> bitset_span, index_t bitset_len, index_t original_nbits = 0)#
Create a bitset view from a device vector view of the bitset.
- Parameters:
bitset_span – Device vector view of the bitset
bitset_len – Number of bits in the bitset
original_nbits – Original number of bits used when the bitset was created, to handle potential mismatches of data types. This is useful for using ANN indexes when a bitset was originally created with a different data type than the ones currently supported in cuVS ANN indexes.
-
inline _RAFT_HOST_DEVICE auto test(const index_t sample_index) const -> bool#
Device function to test if a given index is set in the bitset.
- Parameters:
sample_index – Single index to test
- Returns:
bool True if index has not been unset in the bitset
-
inline _RAFT_HOST_DEVICE auto operator[](const index_t sample_index) const -> bool#
Device function to test if a given index is set in the bitset.
- Parameters:
sample_index – Single index to test
- Returns:
bool True if index has not been unset in the bitset
- inline _RAFT_HOST_DEVICE void set (const index_t sample_index, bool set_value) const
Device function to set a given index to set_value in the bitset.
- Parameters:
sample_index – index to set
set_value – Value to set the bit to (true or false)
-
inline _RAFT_HOST_DEVICE auto size() const -> index_t#
Get the number of bits of the bitset representation.
-
inline _RAFT_HOST_DEVICE auto n_elements() const -> index_t#
Get the number of elements used by the bitset representation.
-
void count(const raft::resources &res, raft::device_scalar_view<index_t> count_gpu_scalar) const#
Returns the number of bits set to true in count_gpu_scalar.
- Parameters:
res – [in] RAFT resources
count_gpu_scalar – [out] Device scalar to store the count
-
inline index_t count(const raft::resources &res) const#
Returns the number of bits set to true.
- Parameters:
res – RAFT resources
- Returns:
index_t Number of bits set to true
-
void repeat(const raft::resources &res, index_t times, bitset_t *output_device_ptr) const#
Repeats the bitset data and copies it to the output device pointer.
This function takes the original bitset data stored in the device memory and repeats it a specified number of times into a new location in the device memory. The bits are copied bit-by-bit to ensure that even if the number of bits (bitset_len_) is not a multiple of the bitset element size (e.g., 32 for uint32_t), the bits are tightly packed without any gaps between rows.
The caller must ensure that the output device pointer has enough memory allocated to hold
times * bitset_len
bits, wherebitset_len
is the number of bits in the original bitset. This function uses Thrust parallel algorithms to efficiently perform the operation on the GPU.- Parameters:
res – RAFT resources for managing CUDA streams and execution policies.
times – Number of times the bitset data should be repeated in the output.
output_device_ptr – Device pointer where the repeated bitset data will be stored.
-
double sparsity(const raft::resources &res) const#
Calculate the sparsity (fraction of 0s) of the bitset.
This function computes the sparsity of the bitset, defined as the ratio of unset bits (0s) to the total number of bits in the set. If the total number of bits is zero, the function returns 1.0, indicating the set is fully sparse.
This API will synchronize on the stream of
res
.- Parameters:
res – RAFT resources for managing CUDA streams and execution policies.
- Returns:
double The sparsity of the bitset, i.e., the fraction of unset bits.
-
template<typename csr_matrix_t>
void to_csr(const raft::resources &res, csr_matrix_t &csr) const# Converts to a Compressed Sparse Row (CSR) format matrix.
This method transforms the bitset view into a CSR matrix representation, where each ‘1’ bit in the bitset corresponds to a non-zero entry in the CSR matrix. The bitset format supports only a single-row matrix, so if the CSR matrix requires multiple rows, the bitset data is repeated for each row in the output.
Example usage:
#include <raft/core/resource/cuda_stream.hpp> #include <raft/sparse/convert/csr.cuh> #include <rmm/device_uvector.hpp> using bitset_t = uint32_t; using index_t = int; using value_t = float; raft::resources handle; auto stream = resource::get_cuda_stream(handle); index_t n_rows = 3; index_t n_cols = 30; // Compute bitset size and initialize device memory index_t bitset_size = (n_cols + sizeof(bitset_t) * 8 - 1) / (sizeof(bitset_t) * 8); rmm::device_uvector<bitset_t> bitset_d(bitset_size, stream); std::vector<bitset_t> bitset_h = { bitset_t(0b11001010), }; // Example bitset, with 4 non-zero entries. raft::copy(bitset_d.data(), bitset_h.data(), bitset_h.size(), stream); // Create bitset view and CSR matrix auto bitset_view = raft::core::bitset_view<bitset_t, index_t>(bitset_d.data(), n_cols); auto csr = raft::make_device_csr_matrix<value_t, index_t>(handle, n_rows, n_cols, 4 * n_rows); // Convert bitset to CSR bitset_view.to_csr(handle, csr); resource::sync_stream(handle); // Results: // csr.indptr = [0, 4, 8, 12]; // csr.indices = [1, 3, 6, 7, // 1, 3, 6, 7, // 1, 3, 6, 7]; // csr.values = [1, 1, 1, 1, // 1, 1, 1, 1, // 1, 1, 1, 1];
The caller must ensure that: The
csr
matrix is pre-allocated with dimensions and non-zero count matching the expected output, i.e.,nnz_for_csr = nnz_for_bitset * n_rows
.- Template Parameters:
csr_matrix_t – Specifies the CSR matrix type, constrained to raft::device_csr_matrix.
- Parameters:
res – [in] RAFT resources for managing CUDA streams and execution policies.
csr – [out] Output parameter where the resulting CSR matrix is stored. Each ‘1’ bit in the bitset corresponds to a non-zero element in the CSR matrix.
Public Static Functions
-
static inline size_t eval_n_elements(size_t bitset_len)#
Calculates the number of
bitset_t
elements required to store a bitset.This function computes the number of
bitset_t
elements needed to store a bitset, ensuring that all bits are accounted for. If the bitset length is not a multiple of thebitset_t
size (in bits), the calculation rounds up to include the remaining bits in an additionalbitset_t
element.- Parameters:
bitset_len – The total length of the bitset in bits.
- Returns:
size_t The number of
bitset_t
elements required to store the bitset.
-
template<typename bitset_t = uint32_t, typename index_t = uint32_t>
struct bitset# - #include <bitset.hpp>
RAFT Bitset.
This structure encapsulates a bitset in device memory. It provides a view() method to get a device-usable lightweight view of the bitset. Each index is represented by a single bit in the bitset. The total number of bytes used is ceil(bitset_len / 8).
- Template Parameters:
bitset_t – Underlying type of the bitset array. Default is uint32_t.
index_t – Indexing type used. Default is uint32_t.
Public Functions
-
bitset(const raft::resources &res, raft::device_vector_view<const index_t, index_t> mask_index, index_t bitset_len, bool default_value = true)#
Construct a new bitset object with a list of indices to unset.
- Parameters:
res – RAFT resources
mask_index – List of indices to unset in the bitset
bitset_len – Length of the bitset
default_value – Default value to set the bits to. Default is true.
-
bitset(const raft::resources &res, index_t bitset_len, bool default_value = true)#
Construct a new bitset object.
- Parameters:
res – RAFT resources
bitset_len – Length of the bitset
default_value – Default value to set the bits to. Default is true.
-
inline raft::core::bitset_view<bitset_t, index_t> view()#
Create a device-usable view of the bitset.
- Returns:
-
inline raft::device_vector_view<bitset_t, index_t> to_mdspan()#
Get an mdspan view of the current bitset.
-
void resize(const raft::resources &res, index_t new_bitset_len, bool default_value = true)#
Resize the bitset. If the requested size is larger, new memory is allocated and set to the default value.
- Parameters:
res – RAFT resources
new_bitset_len – new size of the bitset
default_value – default value to initialize the new bits to
-
template<typename output_t = bool>
void test(const raft::resources &res, raft::device_vector_view<const index_t, index_t> queries, raft::device_vector_view<output_t, index_t> output) const# Test a list of indices in a bitset.
- Template Parameters:
output_t – Output type of the test. Default is bool.
- Parameters:
res – RAFT resources
queries – List of indices to test
output – List of outputs
-
void set(const raft::resources &res, raft::device_vector_view<const index_t, index_t> mask_index, bool set_value = false)#
Set a list of indices in a bitset to set_value.
- Parameters:
res – RAFT resources
mask_index – indices to remove from the bitset
set_value – Value to set the bits to (true or false)
-
void flip(const raft::resources &res)#
Flip all the bits in a bitset.
- Parameters:
res – RAFT resources
-
void reset(const raft::resources &res, bool default_value = true)#
Reset the bits in a bitset.
- Parameters:
res – RAFT resources
default_value – Value to set the bits to (true or false)
-
void count(const raft::resources &res, raft::device_scalar_view<index_t> count_gpu_scalar)#
Returns the number of bits set to true in count_gpu_scalar.
- Parameters:
res – [in] RAFT resources
count_gpu_scalar – [out] Device scalar to store the count
-
inline index_t count(const raft::resources &res)#
Returns the number of bits set to true.
- Parameters:
res – RAFT resources
- Returns:
index_t Number of bits set to true
-
inline bool any(const raft::resources &res)#
Checks if any of the bits are set to true in the bitset.
- Parameters:
res – RAFT resources
-
inline bool all(const raft::resources &res)#
Checks if all of the bits are set to true in the bitset.
- Parameters:
res – RAFT resources
-
inline bool none(const raft::resources &res)#
Checks if none of the bits are set to true in the bitset.
- Parameters:
res – RAFT resources