Bitmask Operations

Files

file  null_mask.hpp
 APIs for managing validity bitmasks.
 

Functions

size_type cudf::state_null_count (mask_state state, size_type size)
 Returns the null count for a null mask of the specified state representing size elements.
 
std::size_t cudf::bitmask_allocation_size_bytes (size_type number_of_bits, std::size_t padding_boundary=64)
 Computes the required bytes necessary to represent the specified number of bits with a given padding boundary.
 
size_type cudf::num_bitmask_words (size_type number_of_bits)
 Returns the number of bitmask_type words required to represent the specified number of bits.
 
rmm::device_buffer cudf::create_null_mask (size_type size, mask_state state, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a device_buffer for use as a null value indicator bitmask of a column.
 
void cudf::set_null_mask (bitmask_type *bitmask, size_type begin_bit, size_type end_bit, bool valid)
 Sets a pre-allocated bitmask buffer to a given state in the range [begin_bit, end_bit).
 
cudf::size_type cudf::count_set_bits (bitmask_type const *bitmask, size_type start, size_type stop)
 Given a bitmask, counts the number of set (1) bits in the range [start, stop).
 
cudf::size_type cudf::count_unset_bits (bitmask_type const *bitmask, size_type start, size_type stop)
 Given a bitmask, counts the number of unset (0) bits in the range [start, stop).
 
std::vector< size_type > cudf::segmented_count_set_bits (bitmask_type const *bitmask, std::vector< cudf::size_type > const &indices)
 Given a bitmask, counts the number of set (1) bits in every range [indices[2*i], indices[(2*i)+1]) (where 0 <= i < indices.size() / 2).
 
std::vector< size_type > cudf::segmented_count_unset_bits (bitmask_type const *bitmask, std::vector< cudf::size_type > const &indices)
 Given a bitmask, counts the number of unset (0) bits in every range [indices[2*i], indices[(2*i)+1]) (where 0 <= i < indices.size() / 2).
 
rmm::device_buffer cudf::copy_bitmask (bitmask_type const *mask, size_type begin_bit, size_type end_bit, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a device_buffer from a slice of bitmask defined by a range of indices [begin_bit, end_bit).
 
rmm::device_buffer cudf::copy_bitmask (column_view const &view, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Copies view's bitmask from the bits [view.offset(), view.offset() + view.size()) into a device_buffer.
 
rmm::device_buffer cudf::bitmask_and (table_view const &view, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a bitwise AND of the bitmasks of columns of a table.
 

Function Documentation

◆ bitmask_allocation_size_bytes()

std::size_t cudf::bitmask_allocation_size_bytes ( size_type  number_of_bits,
std::size_t  padding_boundary = 64 
)

Computes the required bytes necessary to represent the specified number of bits with a given padding boundary.

Note
The Arrow specification for the null bitmask requires a 64-byte padding boundary.
Parameters
number_of_bits  The number of bits that need to be represented
padding_boundary  The value returned will be rounded up to a multiple of this value
Returns
std::size_t The necessary number of bytes
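
As a rough illustration of the rounding described above, the following host-side sketch computes a padded size using only standard C++. It is not the library's implementation: the name padded_bitmask_bytes is hypothetical, and the sketch assumes 8 bits per byte and a non-zero padding boundary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side helper (not part of cudf): bytes needed to hold
// `number_of_bits` bits, rounded up to a multiple of `padding_boundary`.
std::size_t padded_bitmask_bytes(std::size_t number_of_bits,
                                 std::size_t padding_boundary = 64)
{
  assert(padding_boundary > 0);
  // Smallest number of whole bytes that can hold the bits (8 bits per byte).
  std::size_t const necessary_bytes = (number_of_bits + 7) / 8;
  // Number of padding-sized blocks needed to cover those bytes.
  std::size_t const blocks =
    (necessary_bytes + padding_boundary - 1) / padding_boundary;
  return blocks * padding_boundary;
}
```

Under these assumptions and the default boundary, 1 bit and 512 bits both round up to 64 bytes, while 513 bits round up to 128 bytes.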

◆ bitmask_and()

rmm::device_buffer cudf::bitmask_and ( table_view const &  view,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a bitwise AND of the bitmasks of columns of a table.

If any of the columns isn't nullable, it is considered all valid. If no column in the table is nullable, an empty bitmask is returned.

Parameters
view  The table of columns
mr  Device memory resource used to allocate the returned device_buffer
Returns
rmm::device_buffer Output bitmask
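
The combination rule above can be sketched on the host with plain vectors. This is only a model of the documented semantics, not the device implementation: host_bitmask_and and word_t are hypothetical names, bitmask_type is assumed to be a 32-bit unsigned word, and a null pointer stands in for a column that is not nullable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Hypothetical host model: each entry points to a column's mask words, or is
// nullptr when the column is not nullable. Returns the word-wise AND of all
// present masks, or an empty vector when no column is nullable.
std::vector<word_t> host_bitmask_and(
  std::vector<std::vector<word_t> const*> const& masks, std::size_t num_words)
{
  bool any_nullable = false;
  // Start from "all valid"; AND-ing with this leaves a mask unchanged.
  std::vector<word_t> result(num_words, ~word_t{0});
  for (auto const* m : masks) {
    if (m == nullptr) { continue; }  // non-nullable column: treated as all valid
    assert(m->size() >= num_words);
    any_nullable = true;
    for (std::size_t i = 0; i < num_words; ++i) { result[i] &= (*m)[i]; }
  }
  if (!any_nullable) { result.clear(); }
  return result;
}
```

Skipping a missing mask is equivalent to AND-ing with all ones, which is why a non-nullable column never clears a bit in the result.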

◆ copy_bitmask() [1/2]

rmm::device_buffer cudf::copy_bitmask ( bitmask_type const *  mask,
size_type  begin_bit,
size_type  end_bit,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Creates a device_buffer from a slice of bitmask defined by a range of indices [begin_bit, end_bit).

Returns an empty device_buffer if mask == nullptr.

Exceptions
cudf::logic_error  if begin_bit > end_bit
cudf::logic_error  if begin_bit < 0
Parameters
mask  Bitmask residing in device memory whose bits will be copied
begin_bit  Index of the first bit to be copied (inclusive)
end_bit  Index of the last bit to be copied (exclusive)
mr  Device memory resource used to allocate the returned device_buffer
Returns
rmm::device_buffer A device_buffer containing the bits [begin_bit, end_bit) from mask.
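
To make the slice semantics concrete, here is a host-side sketch that copies a bit range into a fresh mask whose bit 0 corresponds to begin_bit. It is an illustration under stated assumptions rather than the library's code: host_copy_bitmask, source_bit_is_set and word_t are hypothetical names, bitmask_type is assumed to be a 32-bit word, and bits are assumed to be ordered least-significant-bit first within each word (the Arrow convention).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Reads bit `i`, assuming least-significant-bit-first order within a word.
inline bool source_bit_is_set(word_t const* mask, std::size_t i)
{
  return ((mask[i / 32] >> (i % 32)) & 1u) != 0;
}

// Hypothetical host model: copies bits [begin_bit, end_bit) of `mask` into a
// new mask starting at bit 0. Returns an empty vector when `mask` is null.
std::vector<word_t> host_copy_bitmask(word_t const* mask,
                                      std::size_t begin_bit,
                                      std::size_t end_bit)
{
  assert(begin_bit <= end_bit);
  if (mask == nullptr) { return {}; }
  std::size_t const num_bits = end_bit - begin_bit;
  std::vector<word_t> out((num_bits + 31) / 32, 0u);
  for (std::size_t i = 0; i < num_bits; ++i) {
    if (source_bit_is_set(mask, begin_bit + i)) {
      out[i / 32] |= word_t{1} << (i % 32);
    }
  }
  return out;
}
```

The bit-by-bit loop favors clarity; a production version would shift and merge whole words instead.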

◆ copy_bitmask() [2/2]

Copies view's bitmask from the bits [view.offset(), view.offset() + view.size()) into a device_buffer.

Returns an empty device_buffer if the column is not nullable.

Parameters
view  Column view whose bitmask needs to be copied
mr  Device memory resource used to allocate the returned device_buffer
Returns
rmm::device_buffer A device_buffer containing the bits [view.offset(), view.offset() + view.size()) from view's bitmask.

◆ count_set_bits()

cudf::size_type cudf::count_set_bits ( bitmask_type const *  bitmask,
size_type  start,
size_type  stop 
)

Given a bitmask, counts the number of set (1) bits in the range [start, stop).

Returns 0 if bitmask == nullptr.

Exceptions
cudf::logic_error  if start > stop
cudf::logic_error  if start < 0
Parameters
bitmask  Bitmask residing in device memory whose bits will be counted
start  Index of the first bit to count (inclusive)
stop  Index of the last bit to count (exclusive)
Returns
The number of non-zero bits in the specified range
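
The half-open range convention can be checked with a small host-side model. This sketch is not the device implementation: host_count_set_bits and word_t are hypothetical names, bitmask_type is assumed to be a 32-bit word, and bits are assumed to be numbered least-significant-bit first within each word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Hypothetical host model: counts the 1 bits in [start, stop).
// Returns 0 for a null mask, mirroring the documented behavior.
std::size_t host_count_set_bits(word_t const* bitmask,
                                std::size_t start,
                                std::size_t stop)
{
  assert(start <= stop);
  if (bitmask == nullptr) { return 0; }
  std::size_t count = 0;
  for (std::size_t i = start; i < stop; ++i) {
    // Word i / 32 holds bit i at position i % 32 (LSB-first assumption).
    count += (bitmask[i / 32] >> (i % 32)) & 1u;
  }
  return count;
}
```

For a non-null mask, the number of unset bits in the same range is (stop - start) minus this count.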

◆ count_unset_bits()

cudf::size_type cudf::count_unset_bits ( bitmask_type const *  bitmask,
size_type  start,
size_type  stop 
)

Given a bitmask, counts the number of unset (0) bits in the range [start, stop).

Returns 0 if bitmask == nullptr.

Exceptions
cudf::logic_error  if start > stop
cudf::logic_error  if start < 0
Parameters
bitmask  Bitmask residing in device memory whose bits will be counted
start  Index of the first bit to count (inclusive)
stop  Index of the last bit to count (exclusive)
Returns
The number of zero bits in the specified range

◆ create_null_mask()

rmm::device_buffer cudf::create_null_mask ( size_type  size,
mask_state  state,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Creates a device_buffer for use as a null value indicator bitmask of a column.

Parameters
size  The number of elements to be represented by the mask
state  The desired state of the mask
mr  Device memory resource used to allocate the returned device_buffer
Returns
rmm::device_buffer A device_buffer for use as a null bitmask satisfying the desired size and state

◆ num_bitmask_words()

size_type cudf::num_bitmask_words ( size_type  number_of_bits)

Returns the number of bitmask_type words required to represent the specified number of bits.

Unlike bitmask_allocation_size_bytes, which returns the number of bytes needed for a bitmask allocation (including padding), this function returns the actual number of bitmask_type elements necessary to represent number_of_bits. This is useful when one wishes to process all of the bits in a bitmask and ignore the padding/slack bits.

Parameters
number_of_bits  The number of bits that need to be represented
Returns
size_type The necessary number of bitmask_type elements
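
The difference from the padded allocation size can be seen in a short host-side sketch. The names host_num_bitmask_words and word_t are hypothetical, and bitmask_type is assumed to be a 32-bit unsigned word; the real function may differ in its argument types.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Hypothetical host model: number of whole words needed to hold
// `number_of_bits` bits, with no allocation padding added.
std::size_t host_num_bitmask_words(std::size_t number_of_bits)
{
  constexpr std::size_t bits_per_word = sizeof(word_t) * CHAR_BIT;
  // Ceiling division: a partially used word still counts as one word.
  return (number_of_bits + bits_per_word - 1) / bits_per_word;
}
```

Under these assumptions, 33 bits need 2 words (8 bytes of meaningful data), whereas an allocation padded to a 64-byte boundary for the same 33 bits would occupy 64 bytes.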

◆ segmented_count_set_bits()

std::vector<size_type> cudf::segmented_count_set_bits ( bitmask_type const *  bitmask,
std::vector< cudf::size_type > const &  indices 
)

Given a bitmask, counts the number of set (1) bits in every range [indices[2*i], indices[(2*i)+1]) (where 0 <= i < indices.size() / 2).

Returns an empty vector if bitmask == nullptr.

Exceptions
cudf::logic_error  if indices.size() % 2 != 0
cudf::logic_error  if indices[2*i] < 0 or indices[2*i] > indices[(2*i)+1]
Parameters
[in]  bitmask  Bitmask residing in device memory whose bits will be counted
[in]  indices  A vector of indices used to specify ranges to count the number of set bits
Returns
std::vector<size_type> A vector storing the number of non-zero bits in the specified ranges
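
The layout of indices, consecutive (begin, end) pairs flattened into one vector, is easy to misread, so a host-side model may help. This sketch is not the device implementation: host_segmented_count_set_bits and word_t are hypothetical names, bitmask_type is assumed to be a 32-bit word, and bits are assumed to be least-significant-bit first within each word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Hypothetical host model: for each pair (indices[2*i], indices[2*i + 1]),
// counts the 1 bits in that half-open range. Returns an empty vector for a
// null mask, mirroring the documented behavior.
std::vector<std::size_t> host_segmented_count_set_bits(
  word_t const* bitmask, std::vector<std::size_t> const& indices)
{
  assert(indices.size() % 2 == 0);
  if (bitmask == nullptr) { return {}; }
  std::vector<std::size_t> counts;
  counts.reserve(indices.size() / 2);
  for (std::size_t i = 0; i < indices.size() / 2; ++i) {
    std::size_t const first = indices[2 * i];
    std::size_t const last  = indices[2 * i + 1];
    assert(first <= last);
    std::size_t n = 0;
    for (std::size_t b = first; b < last; ++b) {
      n += (bitmask[b / 32] >> (b % 32)) & 1u;
    }
    counts.push_back(n);
  }
  return counts;
}
```

In this model the ranges are independent of one another, so they may overlap or appear in any order; the result has one count per pair, in input order.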

◆ segmented_count_unset_bits()

std::vector<size_type> cudf::segmented_count_unset_bits ( bitmask_type const *  bitmask,
std::vector< cudf::size_type > const &  indices 
)

Given a bitmask, counts the number of unset (0) bits in every range [indices[2*i], indices[(2*i)+1]) (where 0 <= i < indices.size() / 2).

Returns an empty vector if bitmask == nullptr.

Exceptions
cudf::logic_error  if indices.size() % 2 != 0
cudf::logic_error  if indices[2*i] < 0 or indices[2*i] > indices[(2*i)+1]
Parameters
[in]  bitmask  Bitmask residing in device memory whose bits will be counted
[in]  indices  A vector of indices used to specify ranges to count the number of unset bits
Returns
std::vector<size_type> A vector storing the number of zero bits in the specified ranges

◆ set_null_mask()

void cudf::set_null_mask ( bitmask_type *  bitmask,
size_type  begin_bit,
size_type  end_bit,
bool  valid 
)

Sets a pre-allocated bitmask buffer to a given state in the range [begin_bit, end_bit).

Sets the bits in [begin_bit, end_bit) of bitmask to valid if valid == true, or to null otherwise.

Parameters
bitmask  Pointer to bitmask (e.g. returned by column_view.null_mask())
begin_bit  Index of the first bit to set (inclusive)
end_bit  Index of the last bit to set (exclusive)
valid  If true set all entries to valid; otherwise, set all to null.
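
A host-side sketch of the same range update is shown below. It only models the documented effect: host_set_null_mask and word_t are hypothetical names, bitmask_type is assumed to be a 32-bit word, a set bit is taken to mean valid, bits are assumed to be least-significant-bit first within each word, and the null-pointer check is a defensive choice of this sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using word_t = std::uint32_t;  // stand-in for bitmask_type (assumed 32-bit)

// Hypothetical host model: sets bits [begin_bit, end_bit) to 1 when `valid`
// is true and to 0 otherwise. Bits outside the range are left untouched.
void host_set_null_mask(word_t* bitmask,
                        std::size_t begin_bit,
                        std::size_t end_bit,
                        bool valid)
{
  assert(begin_bit <= end_bit);
  if (bitmask == nullptr) { return; }  // defensive; an assumption of this sketch
  for (std::size_t i = begin_bit; i < end_bit; ++i) {
    word_t const bit = word_t{1} << (i % 32);
    if (valid) {
      bitmask[i / 32] |= bit;   // mark element i valid
    } else {
      bitmask[i / 32] &= ~bit;  // mark element i null
    }
  }
}
```

Because the range is half-open, setting [4, 36) touches the upper 28 bits of the first word and the lowest 4 bits of the second.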

◆ state_null_count()

size_type cudf::state_null_count ( mask_state  state,
size_type  size 
)

Returns the null count for a null mask of the specified state representing size elements.

Parameters
state  The state of the null mask
size  The number of elements represented by the mask
Returns
size_type The count of null elements
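
The mapping from state to null count can be sketched as a simple switch. This section does not list the mask_state enumerators or say how an undetermined count is reported, so everything here is an assumption: the sketch uses a hypothetical enum with unallocated, uninitialized, all-valid and all-null states, and a hypothetical sentinel of -1 for a count that cannot be known.

```cpp
#include <cassert>

// Hypothetical stand-in for mask_state; the real enumerators are assumed.
enum class host_mask_state { unallocated, uninitialized, all_valid, all_null };

// Hypothetical sentinel for "null count cannot be determined from the state".
constexpr int unknown_null_count = -1;

// Hypothetical host model of the state-to-null-count mapping.
int host_state_null_count(host_mask_state state, int size)
{
  assert(size >= 0);
  switch (state) {
    case host_mask_state::unallocated: return 0;    // no mask: nothing is null
    case host_mask_state::uninitialized: return unknown_null_count;
    case host_mask_state::all_valid: return 0;      // every bit set
    case host_mask_state::all_null: return size;    // every bit cleared
  }
  return unknown_null_count;  // not reached for the enumerators above
}
```

Only the all-null state depends on size; the other states yield a constant.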