
Modules

 Concatenating
 
 Gathering
 
 Scattering
 
 Slicing
 
 Splitting
 
 Shifting
 

Files

file  copying.hpp
 Column APIs for gather, scatter, split, slice, etc.
 

Classes

struct  cudf::packed_columns
 Column data in a serialized format.
 
struct  cudf::packed_table
 The result(s) of a contiguous_split.
 

Enumerations

enum  cudf::out_of_bounds_policy : bool { NULLIFY, cudf::out_of_bounds_policy::DONT_CHECK }
 Policy to account for possible out-of-bounds indices.
 
enum  cudf::mask_allocation_policy { cudf::mask_allocation_policy::NEVER, cudf::mask_allocation_policy::RETAIN, cudf::mask_allocation_policy::ALWAYS }
 Indicates when to allocate a mask, based on an existing mask.
 
enum  cudf::sample_with_replacement : bool { FALSE, TRUE }
 Indicates whether a row can be sampled more than once.
 

Functions

std::unique_ptr< column > cudf::empty_like (column_view const &input)
 Initializes and returns an empty column of the same type as the input.
 
std::unique_ptr< column > cudf::allocate_like (column_view const &input, mask_allocation_policy mask_alloc=mask_allocation_policy::RETAIN, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates an uninitialized new column of the same size and type as the input. Supports only fixed-width types.
 
std::unique_ptr< column > cudf::allocate_like (column_view const &input, size_type size, mask_allocation_policy mask_alloc=mask_allocation_policy::RETAIN, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates an uninitialized new column of the specified size and same type as the input. Supports only fixed-width types.
 
std::unique_ptr< table > cudf::empty_like (table_view const &input_table)
 Creates a table of empty columns with the same types as the input_table.
 
void cudf::copy_range_in_place (column_view const &source, mutable_column_view &target, size_type source_begin, size_type source_end, size_type target_begin)
 Copies a range of elements in-place from one column to another.
 
std::unique_ptr< column > cudf::copy_range (column_view const &source, column_view const &target, size_type source_begin, size_type source_end, size_type target_begin, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Copies a range of elements out-of-place from one column to another.
 
packed_columns cudf::pack (cudf::table_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Deep-copy a table_view into a serialized contiguous memory format.
 
packed_columns::metadata cudf::pack_metadata (table_view const &table, uint8_t const *contiguous_buffer, size_t buffer_size)
 Produce the metadata used for packing a table stored in a contiguous buffer.
 
table_view cudf::unpack (packed_columns const &input)
 Deserialize the result of cudf::pack.
 
table_view cudf::unpack (uint8_t const *metadata, uint8_t const *gpu_data)
 Deserialize the result of cudf::pack.
 
std::unique_ptr< column > cudf::copy_if_else (column_view const &lhs, column_view const &rhs, column_view const &boolean_mask, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.
 
std::unique_ptr< column > cudf::copy_if_else (scalar const &lhs, column_view const &rhs, column_view const &boolean_mask, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.
 
std::unique_ptr< column > cudf::copy_if_else (column_view const &lhs, scalar const &rhs, column_view const &boolean_mask, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.
 
std::unique_ptr< column > cudf::copy_if_else (scalar const &lhs, scalar const &rhs, column_view const &boolean_mask, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.
 
std::unique_ptr< scalar > cudf::get_element (column_view const &input, size_type index, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Get the element at specified index from a column.
 
std::unique_ptr< table > cudf::sample (table_view const &input, size_type const n, sample_with_replacement replacement=sample_with_replacement::FALSE, int64_t const seed=0, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Gather n samples from given input randomly.
 
 

Detailed Description

Enumeration Type Documentation

◆ mask_allocation_policy

Indicates when to allocate a mask, based on an existing mask.

Enumerator
NEVER 

Do not allocate a null mask, regardless of input.

RETAIN 

Allocate a null mask if the input contains one.

ALWAYS 

Allocate a null mask, regardless of input.

Definition at line 171 of file copying.hpp.
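
The three enumerators can be read as a small decision rule. The following is a minimal host-side sketch of that rule, not the library's implementation: the enum is a local stand-in with the same enumerator names, and should_allocate_mask and input_has_mask are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Local stand-in for cudf::mask_allocation_policy (same enumerator names).
enum class mask_allocation_policy { NEVER, RETAIN, ALWAYS };

// Hypothetical helper: decides whether the new column gets a null mask.
// `input_has_mask` stands in for "the input column has a null mask".
bool should_allocate_mask(mask_allocation_policy policy, bool input_has_mask)
{
  switch (policy) {
    case mask_allocation_policy::NEVER: return false;            // regardless of input
    case mask_allocation_policy::RETAIN: return input_has_mask;  // follow the input
    case mask_allocation_policy::ALWAYS: return true;            // regardless of input
  }
  assert(false && "unknown mask_allocation_policy value");
  return false;
}
```

Under this reading, RETAIN is the only policy whose outcome depends on the input column.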

◆ out_of_bounds_policy

enum cudf::out_of_bounds_policy : bool
strong

Policy to account for possible out-of-bounds indices.

NULLIFY means to nullify output values corresponding to out-of-bounds gather_map values. DONT_CHECK means do not check whether the indices are out-of-bounds, for better performance.

Enumerator
NULLIFY 

Output values corresponding to out-of-bounds indices are null.

DONT_CHECK 

No bounds checking is performed, for better performance.

Definition at line 43 of file copying.hpp.
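
To make the difference between the two policies concrete, here is a host-side model of a gather over std::vector. It is a sketch under stated assumptions, not cudf code: gather_model is a hypothetical name, std::nullopt stands in for a null row, and every index outside [0, source.size()) is treated as out of bounds (any special handling of negative indices by the library is not modeled).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Local stand-in for cudf::out_of_bounds_policy.
enum class out_of_bounds_policy : bool { NULLIFY, DONT_CHECK };

// Hypothetical host-side gather: output[i] = source[gather_map[i]].
std::vector<std::optional<int>> gather_model(std::vector<int> const& source,
                                             std::vector<int> const& gather_map,
                                             out_of_bounds_policy policy)
{
  std::vector<std::optional<int>> out;
  out.reserve(gather_map.size());
  auto const n = static_cast<int>(source.size());
  for (int idx : gather_map) {
    bool const in_bounds = idx >= 0 && idx < n;
    if (policy == out_of_bounds_policy::NULLIFY) {
      // Checked path: out-of-bounds indices produce a null output row.
      out.push_back(in_bounds ? std::optional<int>{source[static_cast<std::size_t>(idx)]}
                              : std::nullopt);
    } else {
      // DONT_CHECK: the caller promises valid indices. The model asserts
      // the promise instead of reading out of bounds.
      assert(in_bounds);
      out.push_back(source[static_cast<std::size_t>(idx)]);
    }
  }
  return out;
}
```

The unchecked path skips the per-element bounds test, which is where the performance benefit described above would come from.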

Function Documentation

◆ allocate_like() [1/2]

std::unique_ptr<column> cudf::allocate_like ( column_view const &  input,
mask_allocation_policy  mask_alloc = mask_allocation_policy::RETAIN,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Creates an uninitialized new column of the same size and type as the input. Supports only fixed-width types.

Parameters
[in]  input  Immutable view of input column to emulate
[in]  mask_alloc  Optional, policy for allocating the null mask. Defaults to RETAIN.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
A column of type input.type() with sufficient uninitialized capacity to hold the same number of elements as input

◆ allocate_like() [2/2]

std::unique_ptr<column> cudf::allocate_like ( column_view const &  input,
size_type  size,
mask_allocation_policy  mask_alloc = mask_allocation_policy::RETAIN,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Creates an uninitialized new column of the specified size and same type as the input. Supports only fixed-width types.

Parameters
[in]  input  Immutable view of input column to emulate
[in]  size  The desired number of elements that the new column should have capacity for
[in]  mask_alloc  Optional, policy for allocating the null mask. Defaults to RETAIN.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
A column of type input.type() with sufficient uninitialized capacity to hold the specified number of elements

◆ copy_if_else() [1/4]

std::unique_ptr<column> cudf::copy_if_else ( column_view const &  lhs,
column_view const &  rhs,
column_view const &  boolean_mask,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.

Selects each element i in the output column from either rhs or lhs using the following rule: output[i] = (boolean_mask.valid(i) and boolean_mask[i]) ? lhs[i] : rhs[i]

Exceptions
cudf::logic_error  if lhs and rhs are not of the same type
cudf::logic_error  if lhs and rhs are not of the same length
cudf::logic_error  if boolean mask is not of type bool
cudf::logic_error  if boolean mask is not of the same length as lhs and rhs
Parameters
[in]  lhs  left-hand column_view
[in]  rhs  right-hand column_view
[in]  boolean_mask  column of type_id::BOOL8 representing "left (true) / right (false)" boolean for each element. Null element represents false.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
new column with the selected elements
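
The selection rule above, including the "null mask element means false" case, can be modeled on the host as follows. This is a minimal sketch and not the library's implementation: copy_if_else_model is a hypothetical name, std::optional<bool> stands in for a nullable BOOL8 element, std::logic_error stands in for cudf::logic_error, and the columns are plain std::vector<int> without nulls.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model of the column/column overload.
std::vector<int> copy_if_else_model(std::vector<int> const& lhs,
                                    std::vector<int> const& rhs,
                                    std::vector<std::optional<bool>> const& boolean_mask)
{
  if (lhs.size() != rhs.size()) {
    throw std::logic_error("lhs and rhs must have the same length");
  }
  if (boolean_mask.size() != lhs.size()) {
    throw std::logic_error("boolean_mask must have the same length as lhs and rhs");
  }

  std::vector<int> out(lhs.size());
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    // output[i] = (boolean_mask.valid(i) and boolean_mask[i]) ? lhs[i] : rhs[i]
    bool const take_lhs = boolean_mask[i].has_value() && *boolean_mask[i];
    out[i] = take_lhs ? lhs[i] : rhs[i];
  }
  return out;
}
```

With lhs = {1, 2, 3}, rhs = {10, 20, 30} and a mask of {true, false, null}, the model selects {1, 20, 30}: the null mask entry falls through to rhs. The scalar overloads follow the same rule with lhs[i] or rhs[i] replaced by the scalar value.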

◆ copy_if_else() [2/4]

std::unique_ptr<column> cudf::copy_if_else ( column_view const &  lhs,
scalar const &  rhs,
column_view const &  boolean_mask,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.

Selects each element i in the output column from either rhs or lhs using the following rule: output[i] = (boolean_mask.valid(i) and boolean_mask[i]) ? lhs[i] : rhs

Exceptions
cudf::logic_error  if lhs and rhs are not of the same type
cudf::logic_error  if boolean mask is not of type bool
cudf::logic_error  if boolean mask is not of the same length as lhs
Parameters
[in]  lhs  left-hand column_view
[in]  rhs  right-hand scalar
[in]  boolean_mask  column of type_id::BOOL8 representing "left (true) / right (false)" boolean for each element. Null element represents false.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
new column with the selected elements

◆ copy_if_else() [3/4]

std::unique_ptr<column> cudf::copy_if_else ( scalar const &  lhs,
column_view const &  rhs,
column_view const &  boolean_mask,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.

Selects each element i in the output column from either rhs or lhs using the following rule: output[i] = (boolean_mask.valid(i) and boolean_mask[i]) ? lhs : rhs[i]

Exceptions
cudf::logic_error  if lhs and rhs are not of the same type
cudf::logic_error  if boolean mask is not of type bool
cudf::logic_error  if boolean mask is not of the same length as rhs
Parameters
[in]  lhs  left-hand scalar
[in]  rhs  right-hand column_view
[in]  boolean_mask  column of type_id::BOOL8 representing "left (true) / right (false)" boolean for each element. Null element represents false.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
new column with the selected elements

◆ copy_if_else() [4/4]

std::unique_ptr<column> cudf::copy_if_else ( scalar const &  lhs,
scalar const &  rhs,
column_view const &  boolean_mask,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a new column, where each element is selected from either lhs or rhs based on the value of the corresponding element in boolean_mask.

Selects each element i in the output column from either rhs or lhs using the following rule: output[i] = (boolean_mask.valid(i) and boolean_mask[i]) ? lhs : rhs

Exceptions
cudf::logic_error  if boolean mask is not of type bool
Parameters
[in]  lhs  left-hand scalar
[in]  rhs  right-hand scalar
[in]  boolean_mask  column of type_id::BOOL8 representing "left (true) / right (false)" boolean for each element. Null element represents false.
[in]  mr  Device memory resource used to allocate the returned column's device memory
Returns
new column with the selected elements

◆ copy_range()

std::unique_ptr<column> cudf::copy_range ( column_view const &  source,
column_view const &  target,
size_type  source_begin,
size_type  source_end,
size_type  target_begin,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Copies a range of elements out-of-place from one column to another.

Creates a new column as if an in-place copy was performed into target. A copy of target is created first, and then the elements at indices [target_begin, target_begin + N) of that copy are overwritten with the elements of source at indices [source_begin, source_end) (where N = (source_end - source_begin)). Elements outside that range keep their values from target in the returned column.

If source and target refer to the same elements and the ranges overlap, the behavior is undefined.

Exceptions
cudf::logic_error  for an invalid range (if source_begin > source_end, source_begin < 0, source_begin >= source.size(), source_end > source.size(), target_begin < 0, target_begin >= target.size(), or target_begin + (source_end - source_begin) > target.size()).
cudf::logic_error  if target and source have different types.
Parameters
source  The column to copy from inside the range.
target  The column to copy from outside the range.
source_begin  The starting index of the source range (inclusive)
source_end  The index of the last element in the source range (exclusive)
target_begin  The starting index of the target range (inclusive)
mr  Device memory resource used to allocate the returned column's device memory.
Returns
std::unique_ptr<column> The result target column
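
The out-of-place semantics and the range checks listed under Exceptions can be modeled on the host like this. It is a sketch, not the library's code: validate_ranges and copy_range_model are hypothetical names, std::vector<int> stands in for a fixed-width column without nulls, and std::logic_error stands in for cudf::logic_error. The validity conditions are transcribed literally from the list above.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical helper: throws if any documented range condition is violated.
void validate_ranges(int source_size, int target_size,
                     int source_begin, int source_end, int target_begin)
{
  bool const valid = source_begin >= 0 && source_begin <= source_end &&
                     source_begin < source_size && source_end <= source_size &&
                     target_begin >= 0 && target_begin < target_size &&
                     target_begin + (source_end - source_begin) <= target_size;
  if (!valid) { throw std::logic_error("invalid range"); }
}

// Hypothetical host-side model of the out-of-place copy.
std::vector<int> copy_range_model(std::vector<int> const& source,
                                  std::vector<int> const& target,
                                  int source_begin, int source_end, int target_begin)
{
  validate_ranges(static_cast<int>(source.size()), static_cast<int>(target.size()),
                  source_begin, source_end, target_begin);

  std::vector<int> result = target;  // start from a copy of target
  // Overwrite [target_begin, target_begin + N) with source[source_begin, source_end).
  std::copy(source.begin() + source_begin, source.begin() + source_end,
            result.begin() + target_begin);
  assert(result.size() == target.size());  // the result always has target's length
  return result;
}
```

For source = {1, 2, 3, 4, 5}, target = {10, 20, 30, 40, 50, 60}, source_begin = 1, source_end = 4 and target_begin = 2, the model returns {10, 20, 2, 3, 4, 60} and leaves target itself unchanged.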

◆ copy_range_in_place()

void cudf::copy_range_in_place ( column_view const &  source,
mutable_column_view &  target,
size_type  source_begin,
size_type  source_end,
size_type  target_begin 
)

Copies a range of elements in-place from one column to another.

Overwrites the range of elements in target indicated by the indices [target_begin, target_begin + N) with the elements from source indicated by the indices [source_begin, source_end) (where N = (source_end - source_begin)). Use the out-of-place copy function returning std::unique_ptr<column> for use cases requiring memory reallocation, for example strings columns and other variable-width types.

If source and target refer to the same elements and the ranges overlap, the behavior is undefined.

Exceptions
cudf::logic_error  if memory reallocation is required (e.g. for variable-width types).
cudf::logic_error  for an invalid range (if source_begin > source_end, source_begin < 0, source_begin >= source.size(), source_end > source.size(), target_begin < 0, target_begin >= target.size(), or target_begin + (source_end - source_begin) > target.size()).
cudf::logic_error  if target and source have different types.
cudf::logic_error  if source has null values and target is not nullable.
Parameters
source  The column to copy from
target  The preallocated column to copy into
source_begin  The starting index of the source range (inclusive)
source_end  The index of the last element in the source range (exclusive)
target_begin  The starting index of the target range (inclusive)

◆ empty_like() [1/2]

std::unique_ptr<column> cudf::empty_like ( column_view const &  input)

Initializes and returns an empty column of the same type as the input.

Parameters
[in]  input  Immutable view of input column to emulate
Returns
std::unique_ptr<column> An empty column of same type as input

◆ empty_like() [2/2]

std::unique_ptr<table> cudf::empty_like ( table_view const &  input_table)

Creates a table of empty columns with the same types as the input_table

Creates the cudf::column objects, but does not allocate any underlying device memory for the column's data or bitmask.

Parameters
[in]  input_table  Immutable view of input table to emulate
Returns
std::unique_ptr<table> A table of empty columns with the same types as the columns in input_table

◆ get_element()

std::unique_ptr<scalar> cudf::get_element ( column_view const &  input,
size_type  index,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Get the element at specified index from a column.

Warning
This function is expensive (it invokes a kernel launch), so it is not recommended in performance-sensitive code or inside a loop.
Exceptions
cudf::logic_error  if index is not within the range [0, input.size())
Parameters
input  Column view to get the element from
index  Index into input to get the element at
mr  Device memory resource used to allocate the returned scalar's device memory.
Returns
std::unique_ptr<scalar> Scalar containing the single value

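◆ pack()

packed_columns cudf::pack ( cudf::table_view const &  input,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Deep-copy a table_view into a serialized contiguous memory format.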

The metadata from the table_view is copied into a host vector of bytes and the data from the table_view is copied into a device_buffer. Pass the output of this function into cudf::unpack to deserialize.

Parameters
input  View of the table to pack
[in]  mr  Optional, the resource to use for all returned device allocations
Returns
packed_columns A struct containing the serialized metadata and data in contiguous host and device memory respectively

◆ pack_metadata()

packed_columns::metadata cudf::pack_metadata ( table_view const &  table,
uint8_t const *  contiguous_buffer,
size_t  buffer_size 
)

Produce the metadata used for packing a table stored in a contiguous buffer.

The metadata from the table_view is copied into a host vector of bytes which can be used to construct a packed_columns or packed_table structure. The caller is responsible for guaranteeing that all of the columns in the table point into contiguous_buffer.

Parameters
table  View of the table to pack
contiguous_buffer  A contiguous buffer of device memory which contains the data referenced by the columns in table
buffer_size  The size of contiguous_buffer.
Returns
Vector of bytes representing the metadata used to unpack a packed_columns struct.

◆ sample()

std::unique_ptr<table> cudf::sample ( table_view const &  input,
size_type const  n,
sample_with_replacement  replacement = sample_with_replacement::FALSE,
int64_t const  seed = 0,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Gather n samples from given input randomly.

Example:
input: {col1: {1, 2, 3, 4, 5}, col2: {6, 7, 8, 9, 10}}
n: 3
replacement: false
output: {col1: {3, 1, 4}, col2: {8, 6, 9}}
replacement: true
output: {col1: {3, 1, 1}, col2: {8, 6, 6}}
Exceptions
cudf::logic_error  if n > input.num_rows() and replacement == FALSE.
cudf::logic_error  if n < 0.
Parameters
input  View of a table to sample.
n  Non-negative number of samples expected from input.
replacement  Allow or disallow sampling of the same row more than once.
seed  Seed value to initiate random number generator.
mr  Device memory resource used to allocate the returned table's device memory
Returns
std::unique_ptr<table> Table containing samples from input
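
The two sampling modes and their error conditions can be sketched on the host as a function that picks row indices. This is an illustrative model only: sample_indices_model is a hypothetical name, the generator and algorithms are standard-library choices rather than whatever the library uses, so the rows selected for a given seed will not match cudf::sample, and the guard for an empty input is a model-only addition.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model: returns n row indices drawn from [0, num_rows).
std::vector<int> sample_indices_model(int num_rows, int n, bool with_replacement,
                                      std::int64_t seed = 0)
{
  if (n < 0) { throw std::logic_error("n must be non-negative"); }
  if (!with_replacement && n > num_rows) {
    throw std::logic_error("n exceeds the number of rows when sampling without replacement");
  }
  if (n == 0) { return {}; }
  // Model-only guard: nothing can be drawn from an empty input.
  if (num_rows == 0) { throw std::logic_error("cannot sample from an empty input"); }

  std::mt19937_64 rng(static_cast<std::uint64_t>(seed));
  std::vector<int> picked;
  if (with_replacement) {
    // Each draw is independent, so the same row may appear more than once.
    std::uniform_int_distribution<int> dist(0, num_rows - 1);
    picked.resize(static_cast<std::size_t>(n));
    std::generate(picked.begin(), picked.end(), [&] { return dist(rng); });
  } else {
    // Shuffle all row indices and keep the first n, so every row is distinct.
    std::vector<int> all(static_cast<std::size_t>(num_rows));
    std::iota(all.begin(), all.end(), 0);
    std::shuffle(all.begin(), all.end(), rng);
    picked.assign(all.begin(), all.begin() + n);
  }
  return picked;
}
```

Gathering every column of the table at the returned indices would then yield the sampled table, which is why the rows stay aligned across col1 and col2 in the example above.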

◆ unpack() [1/2]

table_view cudf::unpack ( packed_columns const &  input)

Deserialize the result of cudf::pack

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in input.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters
input  The packed columns to unpack
Returns
The unpacked table_view

◆ unpack() [2/2]

table_view cudf::unpack ( uint8_t const *  metadata,
uint8_t const *  gpu_data 
)

Deserialize the result of cudf::pack

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in gpu_data using the metadata contained in the host buffer metadata.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters
metadata  The host-side metadata buffer resulting from the initial pack() call
gpu_data  The device-side contiguous buffer storing the data that will be referenced by the resulting table_view
Returns
The unpacked table_view
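
The pack/unpack contract (one contiguous data buffer, separate metadata, and non-owning views that must not outlive the buffer) can be illustrated with a host-side model. Everything here is an assumption made for illustration: packed_model, column_view_model, pack_model and unpack_model are hypothetical names, the metadata layout (an offset and an element count per column) is invented and is not cudf's serialized format, and host memory stands in for the device buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of packed_columns: metadata plus one contiguous data buffer.
struct packed_model {
  std::vector<std::size_t> metadata;  // per column: byte offset, element count (invented layout)
  std::vector<std::uint8_t> data;     // all column data stored back to back
};

// Non-owning view of one int32 column inside the contiguous buffer.
struct column_view_model {
  std::uint8_t const* bytes;
  std::size_t size;

  std::int32_t at(std::size_t i) const
  {
    assert(i < size);
    std::int32_t value;
    std::memcpy(&value, bytes + i * sizeof(value), sizeof(value));
    return value;
  }
};

// Deep-copies every column into a single buffer and records where each one starts.
packed_model pack_model(std::vector<std::vector<std::int32_t>> const& table)
{
  packed_model out;
  for (auto const& col : table) {
    std::size_t const offset = out.data.size();
    std::size_t const nbytes = col.size() * sizeof(std::int32_t);
    out.metadata.push_back(offset);
    out.metadata.push_back(col.size());
    out.data.resize(offset + nbytes);
    if (nbytes > 0) { std::memcpy(out.data.data() + offset, col.data(), nbytes); }
  }
  return out;
}

// Builds views that point into `data`. No column data is copied, so the views
// are only valid while the buffer behind `data` is alive.
std::vector<column_view_model> unpack_model(std::vector<std::size_t> const& metadata,
                                            std::uint8_t const* data)
{
  std::vector<column_view_model> views;
  for (std::size_t i = 0; i + 1 < metadata.size(); i += 2) {
    views.push_back({data + metadata[i], metadata[i + 1]});
  }
  return views;
}
```

In this model, destroying the packed_model while views returned by unpack_model are still in use leaves them dangling, which mirrors the caller's responsibility stated above.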