rapidsmpf::BloomFilter Struct Reference

A bloom filter, used for approximate set membership queries.

#include <bloom_filter.hpp>

Public Member Functions

 BloomFilter (std::size_t num_blocks, std::uint64_t seed, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
 Create a filter.
 
void add (cudf::table_view const &values_to_hash, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
 Add values to the filter.
 
void merge (BloomFilter &other, rmm::cuda_stream_view stream)
 Merge two filters, computing their union.
 
rmm::device_uvector< bool > contains (cudf::table_view const &values, rmm::cuda_stream_view stream, rmm::device_async_resource_ref mr)
 Return a mask of which rows are contained in the filter.
 
rmm::cuda_stream_view stream () const noexcept
 
void * data () noexcept
 
std::size_t size () const noexcept
 

Static Public Member Functions

static std::size_t fitting_num_blocks (std::size_t l2size) noexcept
 

Detailed Description

A bloom filter, used for approximate set membership queries.

Definition at line 19 of file bloom_filter.hpp.

Constructor & Destructor Documentation

◆ BloomFilter()

rapidsmpf::BloomFilter::BloomFilter ( std::size_t  num_blocks,
std::uint64_t  seed,
rmm::cuda_stream_view  stream,
rmm::device_async_resource_ref  mr 
)

Create a filter.

Parameters
num_blocks  Number of blocks in the filter.
seed        Seed used for hashing each value.
stream      CUDA stream for allocations and device operations.
mr          Memory resource for allocations.

Member Function Documentation

◆ add()

void rapidsmpf::BloomFilter::add ( cudf::table_view const &  values_to_hash,
rmm::cuda_stream_view  stream,
rmm::device_async_resource_ref  mr 
)

Add values to the filter.

Parameters
values_to_hash  Table of values to hash (with cudf::hashing::xxhash_64()).
stream          CUDA stream for allocations and device operations.
mr              Memory resource for allocations.

◆ contains()

rmm::device_uvector<bool> rapidsmpf::BloomFilter::contains ( cudf::table_view const &  values,
rmm::cuda_stream_view  stream,
rmm::device_async_resource_ref  mr 
)

Return a mask of which rows are contained in the filter.

Parameters
values  Values to check for set membership.
stream  CUDA stream for allocations and device operations.
mr      Memory resource for allocations.
Returns
Mask vector to be used for filtering the table.
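
To make the add/contains contract concrete, the following is a minimal host-side sketch of a blocked bloom filter in standard C++. It is an illustrative model only, not the rapidsmpf device implementation: the name HostBloomFilter, the 8-word block layout, and the mix function (a splitmix64-style mixer standing in for a seeded 64-bit hash such as cudf::hashing::xxhash_64()) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of a blocked bloom filter.
// Not the rapidsmpf implementation; layout and hash are assumptions.
struct HostBloomFilter {
    static constexpr std::size_t words_per_block = 8;  // assumed block layout
    std::uint64_t seed;
    std::vector<std::uint32_t> words;

    // num_blocks must be greater than zero.
    HostBloomFilter(std::size_t num_blocks, std::uint64_t seed_)
        : seed(seed_), words(num_blocks * words_per_block, 0u) {}

    // splitmix64-style mixer, used here in place of a real seeded hash.
    static std::uint64_t mix(std::uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    std::size_t num_blocks() const { return words.size() / words_per_block; }

    // Set one bit in every word of the block selected by each key.
    void add(std::vector<std::uint64_t> const& keys) {
        for (auto key : keys) {
            std::uint64_t h = mix(key ^ seed);
            std::size_t base = static_cast<std::size_t>(h % num_blocks()) * words_per_block;
            for (std::size_t w = 0; w < words_per_block; ++w) {
                h = mix(h);
                words[base + w] |= std::uint32_t{1} << (h & 31u);
            }
        }
    }

    // Return one flag per key: true if every bit the key selects is set.
    std::vector<bool> contains(std::vector<std::uint64_t> const& keys) const {
        std::vector<bool> mask;
        mask.reserve(keys.size());
        for (auto key : keys) {
            std::uint64_t h = mix(key ^ seed);
            std::size_t base = static_cast<std::size_t>(h % num_blocks()) * words_per_block;
            bool hit = true;
            for (std::size_t w = 0; w < words_per_block; ++w) {
                h = mix(h);
                hit = hit && (words[base + w] & (std::uint32_t{1} << (h & 31u))) != 0;
            }
            mask.push_back(hit);
        }
        return mask;
    }
};
```

In this model a key that was added always yields true, while a key that was never added usually yields false but may yield true (a false positive). A mask of this kind is therefore suited to pre-filtering rows ahead of an exact check, not to replacing one.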

◆ data()

void* rapidsmpf::BloomFilter::data ( )
noexcept
Returns
Pointer to the underlying storage.

◆ fitting_num_blocks()

static std::size_t rapidsmpf::BloomFilter::fitting_num_blocks ( std::size_t  l2size)
static noexcept
Returns
Number of blocks to use if the filter should fit in a given L2 cache size.
Parameters
l2size  Size of the L2 cache in bytes.
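
As a rough illustration of the sizing arithmetic, the sketch below divides the cache size by a per-block byte count. The 32-byte block size and the function name sketch_fitting_num_blocks are assumptions for this example; the real block size, and any safety margin the library applies, are implementation details not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed block size in bytes; hypothetical, not taken from the library.
constexpr std::size_t assumed_block_bytes = 32;

// Sketch: largest whole number of blocks that fits in l2size bytes,
// clamped so that the result is never zero.
constexpr std::size_t sketch_fitting_num_blocks(std::size_t l2size) noexcept {
    return std::max<std::size_t>(std::size_t{1}, l2size / assumed_block_bytes);
}
```

On CUDA devices the L2 cache size in bytes is reported by cudaDeviceProp::l2CacheSize, which is one plausible source for the l2size argument.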

◆ merge()

void rapidsmpf::BloomFilter::merge ( BloomFilter &  other,
rmm::cuda_stream_view  stream 
)

Merge two filters, computing their union.

Parameters
other   Other filter to merge into this one.
stream  CUDA stream for device operations.
Exceptions
std::logic_error  If other is not compatible with this filter.
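
The union of two bloom filters can be computed as a bitwise OR of their storage, which the host-side sketch below illustrates. HostFilterBits and merge_into are hypothetical names, and the compatibility rule used here (equal seed and equal storage size) is an assumption; the exact condition checked by the library is not specified on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-side stand-in for a filter's seed and bit storage.
struct HostFilterBits {
    std::uint64_t seed;
    std::vector<std::uint32_t> words;
};

// OR the bits of other into self, so that self represents the union.
// Assumed compatibility rule: same seed and same number of words.
inline void merge_into(HostFilterBits& self, HostFilterBits const& other) {
    if (self.seed != other.seed || self.words.size() != other.words.size()) {
        throw std::logic_error("incompatible bloom filters");
    }
    for (std::size_t i = 0; i < self.words.size(); ++i) {
        self.words[i] |= other.words[i];
    }
}
```

Because OR only sets bits, any key reported as present by either input is still reported as present after the merge. Filters hashed with different seeds would place the same key at different bits, which is why this sketch rejects them.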

◆ size()

std::size_t rapidsmpf::BloomFilter::size ( ) const
noexcept
Returns
Size in bytes of the underlying storage.

◆ stream()

rmm::cuda_stream_view rapidsmpf::BloomFilter::stream ( ) const
noexcept
Returns
The stream the underlying storage is valid on.

The documentation for this struct was generated from the following file: