
Files

file  transform.hpp
 Column APIs for transforming rows.
 

Functions

std::unique_ptr< column > cudf::transform (column_view const &input, std::string const &unary_udf, data_type output_type, bool is_ptx, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a new column by applying a unary function against every element of an input column.
 
std::pair< std::unique_ptr< rmm::device_buffer >, size_type > cudf::nans_to_nulls (column_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a null_mask from input by converting NaN to null while preserving existing null values; also returns the new null_count.
 
std::pair< std::unique_ptr< rmm::device_buffer >, cudf::size_type > cudf::bools_to_mask (column_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a bitmask from a column of boolean elements.
 
std::pair< std::unique_ptr< cudf::table >, std::unique_ptr< cudf::column > > cudf::encode (cudf::table_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Encode the rows of the given table as integers.
 
std::unique_ptr< column > cudf::mask_to_bools (bitmask_type const *bitmask, size_type begin_bit, size_type end_bit, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a boolean column from given bitmask.
 

Detailed Description

Function Documentation

◆ bools_to_mask()

std::pair<std::unique_ptr<rmm::device_buffer>, cudf::size_type> cudf::bools_to_mask ( column_view const &  input,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Creates a bitmask from a column of boolean elements.

If element i in input is true, bit i in the resulting mask is set (1). Else, if element i is false or null, bit i is unset (0).

Exceptions
cudf::logic_error  if input.type() is a non-boolean type
Parameters
input  Boolean elements to convert to a bitmask.
mr  Device memory resource used to allocate the returned bitmask.
Returns
A pair containing a device_buffer with the new bitmask and its null count, computed from input by treating true as valid (1) and false as invalid (0).
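
The bit-setting rule above can be illustrated with a small host-side sketch. This is not libcudf code and does not run on the GPU: `bools_to_mask_host`, `nullable_bool`, and `mask_word` are hypothetical names, and the 32-bit word size is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-side stand-ins; none of these names exist in libcudf.
using mask_word = std::uint32_t;  // assumption: 32-bit mask words

struct nullable_bool {
  bool value;  // payload, ignored when valid is false
  bool valid;  // false marks a null element
};

// Bit i is set only when element i is non-null and true.
// The second member of the result counts the unset bits.
std::pair<std::vector<mask_word>, std::size_t> bools_to_mask_host(
  std::vector<nullable_bool> const& input)
{
  constexpr std::size_t bits_per_word = 32;
  std::vector<mask_word> mask((input.size() + bits_per_word - 1) / bits_per_word, 0u);
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (input[i].valid && input[i].value) {
      mask[i / bits_per_word] |= mask_word{1} << (i % bits_per_word);
    } else {
      ++null_count;  // false and null both leave the bit unset
    }
  }
  assert(null_count <= input.size());
  return {mask, null_count};
}
```

In this model, the elements true, false, null, true set bits 0 and 3 only, so two elements are counted as null.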

◆ encode()

std::pair<std::unique_ptr<cudf::table>, std::unique_ptr<cudf::column> > cudf::encode ( cudf::table_view const &  input,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Encode the rows of the given table as integers.

The encoded values are integers in the range [0, n), where n is the number of distinct rows in the input table. The result table is such that keys[result[i]] == input[i], where keys is a table containing the distinct rows in input in sorted ascending order. Nulls, if any, are sorted to the end of the keys table.

Examples:

input: [{'a', 'b', 'b', 'a'}]
output: [{'a', 'b'}], {0, 1, 1, 0}
input: [{1, 3, 1, 2, 9}, {1, 2, 1, 3, 5}]
output: [{1, 2, 3, 9}, {1, 3, 2, 5}], {0, 2, 0, 1, 3}
Parameters
input  Table containing values to be encoded
mr  Device memory resource used to allocate the returned table's device memory
Returns
A pair containing the distinct rows of the input table in sorted order, and a column of integer indices representing the encoded rows.
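
The relation keys[result[i]] == input[i] can be checked with a host-side sketch. It is only a model of the documented semantics, not the libcudf implementation: `encode_host` and `row` are hypothetical names, rows are stored row-major as `std::vector<int>`, and nulls are not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side model: one row holds one value per column.
using row = std::vector<int>;

// Returns {keys, indices} such that keys[indices[i]] == rows[i], where keys
// holds the distinct rows in ascending lexicographic order.
std::pair<std::vector<row>, std::vector<int>> encode_host(std::vector<row> const& rows)
{
  std::vector<row> keys(rows);
  std::sort(keys.begin(), keys.end());
  keys.erase(std::unique(keys.begin(), keys.end()), keys.end());

  std::vector<int> indices;
  indices.reserve(rows.size());
  for (std::size_t i = 0; i < rows.size(); ++i) {
    // Binary search is valid because keys is sorted and contains every row.
    auto it = std::lower_bound(keys.begin(), keys.end(), rows[i]);
    assert(it != keys.end() && *it == rows[i]);
    indices.push_back(static_cast<int>(it - keys.begin()));
  }
  return {keys, indices};
}
```

Feeding the second example above as rows (1,1), (3,2), (1,1), (2,3), (9,5) gives the keys (1,1), (2,3), (3,2), (9,5) and the indices 0, 2, 0, 1, 3.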

◆ mask_to_bools()

std::unique_ptr<column> cudf::mask_to_bools ( bitmask_type const *  bitmask,
size_type  begin_bit,
size_type  end_bit,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Creates a boolean column from given bitmask.

Returns a bool for each bit in [begin_bit, end_bit). If bit i in least-significant bit numbering is set (1), then element i in the output is true, otherwise false.

Exceptions
cudf::logic_error  if bitmask is null and end_bit-begin_bit > 0
cudf::logic_error  if begin_bit > end_bit

Examples:

input: {0b10101010}
output: [{false, true, false, true, false, true, false, true}]
Parameters
bitmask  A device pointer to the bitmask which needs to be converted
begin_bit  Position of the bit from which the conversion should start
end_bit  Position of the bit before which the conversion should stop
mr  Device memory resource used to allocate the returned column's device memory
Returns
A boolean column representing the given mask from [begin_bit, end_bit).
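
The least-significant-bit numbering and the half-open range [begin_bit, end_bit) can be sketched on the host as follows. This is an illustration only: `mask_to_bools_host` and `mask_word` are hypothetical names, the pointer refers to host memory rather than device memory, 32-bit mask words are assumed, and `std::logic_error` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model; assumes 32-bit mask words.
using mask_word = std::uint32_t;

// Produces one bool per bit in [begin_bit, end_bit), least-significant bit first.
std::vector<bool> mask_to_bools_host(mask_word const* bitmask, int begin_bit, int end_bit)
{
  if (begin_bit > end_bit) { throw std::logic_error("begin_bit must not exceed end_bit"); }
  if (bitmask == nullptr && end_bit - begin_bit > 0) {
    throw std::logic_error("bitmask is null but the bit range is non-empty");
  }
  std::vector<bool> out;
  out.reserve(static_cast<std::size_t>(end_bit - begin_bit));
  for (int bit = begin_bit; bit < end_bit; ++bit) {
    mask_word const word = bitmask[bit / 32];
    out.push_back(((word >> (bit % 32)) & 1u) != 0);
  }
  assert(out.size() == static_cast<std::size_t>(end_bit - begin_bit));
  return out;
}
```

With the mask 0b10101010 from the example above, the range [1, 4) selects bits 1, 2, and 3 and yields true, false, true.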

◆ nans_to_nulls()

std::pair<std::unique_ptr<rmm::device_buffer>, size_type> cudf::nans_to_nulls ( column_view const &  input,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Creates a null_mask from input by converting NaN to null while preserving existing null values; also returns the new null_count.

Exceptions
cudf::logic_error  if input.type() is a non-floating-point type
Parameters
input  An immutable view of the input column of floating-point type
mr  Device memory resource used to allocate the returned bitmask.
Returns
A pair containing a device_buffer with the new bitmask and its null count, obtained by replacing NaN in input with null.
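
A host-side sketch of the rule described above: an element stays valid only if it was already valid and is not NaN. The names `nans_to_nulls_host` and `mask_word`, the separate `valid` vector, and the 32-bit word size are assumptions for illustration; the real function operates on a device column.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-side model; assumes 32-bit mask words.
using mask_word = std::uint32_t;

// values[i] is the payload and valid[i] is false for an existing null.
// Returns the new mask and the number of unset bits (the new null count).
std::pair<std::vector<mask_word>, std::size_t> nans_to_nulls_host(
  std::vector<double> const& values, std::vector<bool> const& valid)
{
  assert(values.size() == valid.size());
  std::vector<mask_word> mask((values.size() + 31) / 32, 0u);
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    bool const keep = valid[i] && !std::isnan(values[i]);
    if (keep) {
      mask[i / 32] |= mask_word{1} << (i % 32);
    } else {
      ++null_count;  // existing nulls are preserved, NaNs become nulls
    }
  }
  return {mask, null_count};
}
```

For the values 1.0, NaN, 3.0, 4.0 where the third element is already null, only bits 0 and 3 are set and the null count is 2.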

◆ transform()

std::unique_ptr<column> cudf::transform ( column_view const &  input,
std::string const &  unary_udf,
data_type  output_type,
bool  is_ptx,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Creates a new column by applying a unary function against every element of an input column.

Computes: out[i] = F(in[i])

The output null mask is the same as the input null mask, so if input[i] is null then output[i] is also null.

Parameters
input  An immutable view of the input column to transform
unary_udf  The PTX/CUDA string of the unary function to apply
output_type  The output type that is compatible with the output type in the UDF
is_ptx  true: the UDF is treated as PTX code; false: the UDF is treated as CUDA code
mr  Device memory resource used to allocate the returned column's device memory
Returns
The column resulting from applying the unary function to every element of the input
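
The element-wise rule out[i] = F(in[i]) with a copied null mask can be modeled on the host as below. This sketch does not compile a PTX or CUDA string as the real function does; a `std::function` stands in for the UDF, and `host_column` and `transform_host` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical host-side column: payload plus one validity flag per element.
struct host_column {
  std::vector<float> data;
  std::vector<bool> valid;
};

// Applies f to every valid element; the output validity is a copy of the input's.
host_column transform_host(host_column const& input, std::function<float(float)> const& f)
{
  assert(input.data.size() == input.valid.size());
  host_column out;
  out.valid = input.valid;  // null mask is carried over unchanged
  out.data.resize(input.data.size());
  for (std::size_t i = 0; i < input.data.size(); ++i) {
    if (input.valid[i]) { out.data[i] = f(input.data[i]); }
  }
  return out;
}
```

Calling it with the data 1, 2, 3 (second element null) and a callable computing x * x + 1 gives 2 and 10 in the valid positions, and the second element stays null.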