transform

pylibcudf.transform.bools_to_mask(Column input) tuple

Create a bitmask from a column of boolean elements.

Parameters:
input : Column

Column to produce new mask from.

Returns:
tuple[gpumemoryview, int]

Two-tuple of a gpumemoryview wrapping the bitmask and the null count.
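
The following is a minimal CPU-side sketch of the semantics, not the pylibcudf implementation. It assumes Arrow-style least-significant-bit ordering for the mask and assumes that a null input element clears its bit in the same way as False. The function name bools_to_mask_model is hypothetical, and the model returns plain bytes instead of a gpumemoryview (real device buffers may also be padded).

```python
def bools_to_mask_model(values):
    """Pack booleans into an LSB-first bitmask and count the cleared bits.

    `values` holds True, False, or None (None stands for a null element).
    """
    mask = bytearray((len(values) + 7) // 8)
    null_count = 0
    for i, value in enumerate(values):
        if value is True:
            # Bit i lives in byte i // 8 at bit position i % 8.
            mask[i // 8] |= 1 << (i % 8)
        else:
            # Assumption: False and None both leave the bit cleared.
            null_count += 1
    return bytes(mask), null_count


mask, null_count = bools_to_mask_model([True, False, True, True, None])
print(mask.hex(), null_count)  # prints "0d 2"
```

Bits 0, 2, and 3 are set, giving the byte 0b00001101, and the two remaining elements are counted as nulls.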

pylibcudf.transform.encode(Table input) tuple

Encode the rows of the given table as integers.

Parameters:
input : Table

Table containing values to be encoded.

Returns:
tuple[Table, Column]

The distinct rows of the input table in sorted order, and a column of integer indices that maps each input row to its position among the distinct rows.
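
As an illustration of the return value, here is a pure-Python model that treats each table row as a tuple. It is a sketch of the semantics only: the name encode_model is hypothetical, and null handling is left out.

```python
def encode_model(rows):
    """Return the sorted distinct rows and, per input row, its index among them."""
    keys = sorted(set(rows))
    position = {row: i for i, row in enumerate(keys)}
    indices = [position[row] for row in rows]
    return keys, indices


keys, indices = encode_model([("b", 2), ("a", 1), ("b", 2), ("a", 3)])
print(keys)     # [('a', 1), ('a', 3), ('b', 2)]
print(indices)  # [2, 0, 2, 1]
```

Looking up keys[indices[i]] recovers the original row i, which is what makes the pair an encoding.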

pylibcudf.transform.mask_to_bools(Py_ssize_t bitmask, int begin_bit, int end_bit) Column

Create a boolean column from a given bitmask.

Parameters:
bitmask : int

Pointer to the bitmask to be converted.

begin_bit : int

Position of the bit from which the conversion should start.

end_bit : int

Position of the bit before which the conversion should stop.

Returns:
Column

Boolean column of the bitmask bits in the half-open range [begin_bit, end_bit).
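
The bit range can be illustrated with a small CPU-side model. This is a sketch under assumptions: it takes the mask as host bytes, whereas the real function takes an integer pointer to device memory, and it assumes least-significant-bit ordering. The name mask_to_bools_model is hypothetical.

```python
def mask_to_bools_model(mask, begin_bit, end_bit):
    """Expand bits begin_bit .. end_bit - 1 of an LSB-first bitmask into booleans."""
    return [
        bool((mask[i // 8] >> (i % 8)) & 1)
        for i in range(begin_bit, end_bit)
    ]


mask = bytes([0b00001101])  # bits 0, 2 and 3 are set
print(mask_to_bools_model(mask, 1, 5))  # [False, True, True, False]
```

The result has end_bit - begin_bit elements, because the bit at end_bit itself is not included.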

pylibcudf.transform.nans_to_nulls(Column input) tuple

Create a null mask that preserves existing nulls and converts NaNs to nulls.

For details, see nans_to_nulls().

Parameters:
input : Column

Column to produce new mask from.

Returns:
tuple[gpumemoryview, int]

Two-tuple of a gpumemoryview wrapping the null mask and the new null count.
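
A pure-Python model of the resulting validity information may help. It is a sketch, not the library code: None stands for an existing null, the mask is shown as a list of booleans instead of packed bits, and the name nans_to_nulls_model is hypothetical.

```python
import math


def nans_to_nulls_model(values):
    """Return per-element validity and the null count after treating NaN as null."""
    valid = [v is not None and not math.isnan(v) for v in values]
    return valid, valid.count(False)


valid, null_count = nans_to_nulls_model([1.0, float("nan"), None, 4.5])
print(valid, null_count)  # [True, False, False, True] 2
```

The existing null stays null and the NaN becomes one, so the new null count is 2.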

pylibcudf.transform.one_hot_encode(Column input, Column categories) Table

Encode input by generating one new column for each value in categories; each new column indicates where that value occurs in input.

Parameters:
input : Column

Column containing values to be encoded.

categories : Column

Column containing the categories.

Returns:
Table

A table of the encoded values.
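
The shape of the result can be sketched with plain lists, one boolean list per category. This is an illustrative model only: the name one_hot_encode_model is hypothetical and null handling is not modeled.

```python
def one_hot_encode_model(values, categories):
    """Return one boolean column per category, marking where it equals the input."""
    return [[value == category for value in values] for category in categories]


table = one_hot_encode_model(["a", "c", "a", "b"], ["a", "b"])
for category, column in zip(["a", "b"], table):
    print(category, column)
# a [True, False, True, False]
# b [False, False, False, True]
```

The input value "c" does not appear in categories, so its row is False in every output column.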

pylibcudf.transform.transform(Column input, unicode unary_udf, DataType output_type, bool is_ptx) Column

Create a new column by applying a unary function to every element of an input column.

Parameters:
input : Column

Column to transform.

unary_udf : str

The PTX/CUDA string of the unary function to apply.

output_type : DataType

The output data type. It must be compatible with the output type of unary_udf.

is_ptx : bool

If True, the UDF is treated as PTX code. If False, the UDF is treated as CUDA code.

Returns:
Column

The transformed column having the UDF applied to each element.
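
A call might look like the sketch below. Several details are assumptions rather than facts from this page: the CUDA function layout (output pointer first, input value second) is modeled on examples in the libcudf test suite, the helper name cube_column is hypothetical, and the input is assumed to be a FLOAT32 column. Running it requires a GPU and an installed pylibcudf, so the import is deferred to the helper.

```python
# CUDA source for the UDF. Assumed convention: the result is written
# through the first (pointer) parameter, the element arrives by value.
CUBE_UDF = """
__device__ inline void cube(float* out, float x)
{
  *out = x * x * x;
}
"""


def cube_column(column):
    """Apply CUBE_UDF to a FLOAT32 pylibcudf Column (needs a GPU)."""
    import pylibcudf as plc  # deferred so the module loads without a GPU

    out_type = plc.types.DataType(plc.types.TypeId.FLOAT32)
    # is_ptx=False because CUBE_UDF is CUDA source, not PTX.
    return plc.transform.transform(column, CUBE_UDF, out_type, False)
```

Passing is_ptx=True instead would require unary_udf to contain PTX code for the same function.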