Column Reshape
- group column_reshape
Enums
Functions
-
std::unique_ptr<table> explode(table_view const &input_table, size_type explode_column_idx, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Explodes a list column’s elements.
Any list is exploded, which means the elements of the list in each row are expanded into new rows in the output. The corresponding rows for other columns in the input are duplicated. Example:
[[5,10,15], 100], [[20,25], 200], [[30], 300], returns [5, 100], [10, 100], [15, 100], [20, 200], [25, 200], [30, 300],
Nulls and empty lists propagate in different ways depending on what is null or empty.
[[5,null,15], 100], [null, 200], [[], 300], returns [5, 100], [null, 100], [15, 100],
Note that null lists and empty lists produce no rows in the resulting table, as the example shows, while a null inside a list is represented with a null entry for that column in that row.
- Parameters:
input_table – Table to explode.
explode_column_idx – Column index to explode inside the table.
stream – CUDA stream used for device memory operations and kernel launches.
mr – Device memory resource used to allocate the returned table’s device memory.
- Returns:
A new table with the column at explode_column_idx exploded.
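The row-expansion rule described for explode can be modelled on the host. The following is a minimal sketch in plain C++, not libcudf code: `explode_model`, `Element`, `ListRow`, and `OutRow` are hypothetical names introduced for illustration, and the sketch assumes, as the second example shows, that null lists and empty lists contribute no output rows.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical host-side types (not libcudf types): a list row is either null
// (std::nullopt) or a vector of possibly-null int elements.
using Element = std::optional<int>;
using ListRow = std::optional<std::vector<Element>>;
using OutRow  = std::pair<Element, int>;  // (exploded value, duplicated other column)

// CPU model of the documented explode semantics; not the library implementation.
std::vector<OutRow> explode_model(std::vector<ListRow> const& lists,
                                  std::vector<int> const& other)
{
  assert(lists.size() == other.size());
  std::vector<OutRow> out;
  for (std::size_t row = 0; row < lists.size(); ++row) {
    if (!lists[row].has_value()) { continue; }  // null list: no output rows
    // An empty list has no elements, so this loop emits nothing for it.
    for (Element const& element : *lists[row]) {
      out.emplace_back(element, other[row]);  // the other column's value is duplicated
    }
  }
  return out;
}
```

Feeding the model the rows [[5,null,15], 100], [null, 200], [[], 300] yields the three rows listed in the second example.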
-
std::unique_ptr<table> explode_position(table_view const &input_table, size_type explode_column_idx, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Explodes a list column’s elements and includes a position column.
Any list is exploded, which means the elements of the list in each row are expanded into new rows in the output. The corresponding rows for other columns in the input are duplicated. A position column is added that has the index inside the original list for each row. Example:
[[5,10,15], 100], [[20,25], 200], [[30], 300], returns [0, 5, 100], [1, 10, 100], [2, 15, 100], [0, 20, 200], [1, 25, 200], [0, 30, 300],
Nulls and empty lists propagate in different ways depending on what is null or empty.
[[5,null,15], 100], [null, 200], [[], 300], returns [0, 5, 100], [1, null, 100], [2, 15, 100],
Note that null lists and empty lists produce no rows in the resulting table, as the example shows, while a null inside a list is represented with a null entry for that column in that row.
- Parameters:
input_table – Table to explode.
explode_column_idx – Column index to explode inside the table.
stream – CUDA stream used for device memory operations and kernel launches.
mr – Device memory resource used to allocate the returned table’s device memory.
- Returns:
A new table with the exploded values and their positions. The column order of the returned table is [columns before explode_input, explode_position, explode_value, columns after explode_input].
-
std::unique_ptr<table> explode_outer(table_view const &input_table, size_type explode_column_idx, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Explodes a list column’s elements retaining any null entries or empty lists inside.
Any list is exploded, which means the elements of the list in each row are expanded into new rows in the output. The corresponding rows for other columns in the input are duplicated. Example:
[[5,10,15], 100], [[20,25], 200], [[30], 300], returns [5, 100], [10, 100], [15, 100], [20, 200], [25, 200], [30, 300],
Nulls and empty lists propagate as null entries in the result.
[[5,null,15], 100], [null, 200], [[], 300], returns [5, 100], [null, 100], [15, 100], [null, 200], [null, 300],
- Parameters:
input_table – Table to explode.
explode_column_idx – Column index to explode inside the table.
stream – CUDA stream used for device memory operations and kernel launches.
mr – Device memory resource used to allocate the returned table’s device memory.
- Returns:
A new table with the column at explode_column_idx exploded.
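The difference from explode, namely that null lists and empty lists are kept as a single null entry, can be illustrated with a small host-side model. This is a sketch in plain C++ under stated assumptions, not libcudf code: `explode_outer_model` and the type aliases are hypothetical names, and the behaviour is taken from the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical host-side types (not libcudf types).
using Element = std::optional<int>;
using ListRow = std::optional<std::vector<Element>>;
using OutRow  = std::pair<Element, int>;  // (exploded value, duplicated other column)

// CPU model of the documented explode_outer semantics; not the library implementation.
std::vector<OutRow> explode_outer_model(std::vector<ListRow> const& lists,
                                        std::vector<int> const& other)
{
  assert(lists.size() == other.size());
  std::vector<OutRow> out;
  for (std::size_t row = 0; row < lists.size(); ++row) {
    bool const null_or_empty = !lists[row].has_value() || lists[row]->empty();
    if (null_or_empty) {
      // A null list or an empty list is retained as exactly one null entry.
      out.emplace_back(std::nullopt, other[row]);
      continue;
    }
    for (Element const& element : *lists[row]) {
      out.emplace_back(element, other[row]);
    }
  }
  return out;
}
```

With the rows [[5,null,15], 100], [null, 200], [[], 300], the model produces five rows, the last two being the null entries for 200 and 300.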
-
std::unique_ptr<table> explode_outer_position(table_view const &input_table, size_type explode_column_idx, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Explodes a list column’s elements retaining any null entries or empty lists and includes a position column.
Any list is exploded, which means the elements of the list in each row are expanded into new rows in the output. The corresponding rows for other columns in the input are duplicated. A position column is added that has the index inside the original list for each row. Example:
[[5,10,15], 100], [[20,25], 200], [[30], 300], returns [0, 5, 100], [1, 10, 100], [2, 15, 100], [0, 20, 200], [1, 25, 200], [0, 30, 300],
Nulls and empty lists propagate as null entries in the result.
[[5,null,15], 100], [null, 200], [[], 300], returns [0, 5, 100], [1, null, 100], [2, 15, 100], [0, null, 200], [0, null, 300],
- Parameters:
input_table – Table to explode.
explode_column_idx – Column index to explode inside the table.
stream – CUDA stream used for device memory operations and kernel launches.
mr – Device memory resource used to allocate the returned table’s device memory.
- Returns:
A new table with the column at explode_column_idx exploded and a position column added.
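How the position column interacts with retained null and empty lists can be sketched on the host. The code below is a plain C++ model, not libcudf code: `explode_outer_position_model` and `PositionedRow` are hypothetical names, and the choice of position 0 for a null or empty list is read off the example above rather than from the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical host-side types (not libcudf types).
using Element = std::optional<int>;
using ListRow = std::optional<std::vector<Element>>;

struct PositionedRow {
  int position;   // index of the element inside its original list
  Element value;  // exploded element, possibly null
  int other;      // duplicated value from the other column
};

// CPU model of the documented explode_outer_position semantics.
std::vector<PositionedRow> explode_outer_position_model(std::vector<ListRow> const& lists,
                                                        std::vector<int> const& other)
{
  assert(lists.size() == other.size());
  std::vector<PositionedRow> out;
  for (std::size_t row = 0; row < lists.size(); ++row) {
    if (!lists[row].has_value() || lists[row]->empty()) {
      // Null and empty lists are kept as one null entry at position 0.
      out.push_back(PositionedRow{0, std::nullopt, other[row]});
      continue;
    }
    int position = 0;
    for (Element const& element : *lists[row]) {
      out.push_back(PositionedRow{position, element, other[row]});
      ++position;
    }
  }
  return out;
}
```

Dropping the null-or-empty branch and skipping such rows instead gives the corresponding model for explode_position.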
-
std::unique_ptr<column> interleave_columns(table_view const &input, rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Interleave columns of a table into a single column.
Converts the column-major table input into a row-major column. Example:
in = [[A1, A2, A3], [B1, B2, B3]] return = [A1, B1, A2, B2, A3, B3]
- Throws:
cudf::logic_error – if input contains no columns.
cudf::logic_error – if the input columns’ dtypes are not identical.
- Parameters:
input – [in] Table containing columns to interleave
mr – [in] Device memory resource used to allocate the returned column’s device memory
- Returns:
The interleaved columns as a single column
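The column-major to row-major conversion can be illustrated with a short host-side model. This is a sketch in plain C++, not libcudf code: `interleave_model` is a hypothetical name, strings stand in for column elements, and `std::logic_error` stands in for the documented exception on an input with no columns.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// CPU model of interleaving: element `row` of column `col` lands at output
// index row * num_columns + col.
std::vector<std::string> interleave_model(
  std::vector<std::vector<std::string>> const& columns)
{
  if (columns.empty()) { throw std::logic_error("input contains no columns"); }
  std::size_t const num_rows = columns.front().size();
  std::vector<std::string> out;
  out.reserve(num_rows * columns.size());
  for (std::size_t row = 0; row < num_rows; ++row) {
    for (auto const& column : columns) {
      assert(column.size() == num_rows);  // all columns of a table have equal length
      out.push_back(column[row]);
    }
  }
  return out;
}
```

Applied to [[A1, A2, A3], [B1, B2, B3]], the model returns [A1, B1, A2, B2, A3, B3], matching the example.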
-
std::unique_ptr<table> tile(table_view const &input, size_type count, rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Repeats the rows of the input table count times to form a new table.
output.num_columns() == input.num_columns()
output.num_rows() == input.num_rows() * count
input = [[8, 4, 7], [5, 2, 3]] count = 2 return = [[8, 4, 7, 8, 4, 7], [5, 2, 3, 5, 2, 3]]
- Parameters:
input – [in] Table containing rows to be repeated
count – [in] Number of times to tile “rows”. Must be non-negative
mr – [in] Device memory resource used to allocate the returned table’s device memory
- Returns:
The table containing the tiled “rows”
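The shape relations stated for tile can be checked with a small host-side model. The following is a sketch in plain C++, not libcudf code: `tile_model` and `Table` are hypothetical names, and a table is represented as a vector of int columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical column-major table: one inner vector per column.
using Table = std::vector<std::vector<int>>;

// CPU model of tiling: every column is repeated `count` times back to back,
// so the column count is unchanged and the row count is multiplied by `count`.
Table tile_model(Table const& input, int count)
{
  assert(count >= 0);  // documented precondition: count must be non-negative
  Table output(input.size());
  for (std::size_t col = 0; col < input.size(); ++col) {
    output[col].reserve(input[col].size() * static_cast<std::size_t>(count));
    for (int rep = 0; rep < count; ++rep) {
      output[col].insert(output[col].end(), input[col].begin(), input[col].end());
    }
  }
  return output;
}
```

A count of 0 keeps the columns but leaves each of them empty, which is consistent with output.num_rows() == input.num_rows() * count.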
-
std::unique_ptr<column> byte_cast(column_view const &input_column, flip_endianness endian_configuration, rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())
Converts a column’s elements to lists of bytes.
input<int32> = [8675, 309] configuration = flip_endianness::YES return = [[0x00, 0x00, 0x21, 0xe3], [0x00, 0x00, 0x01, 0x35]]
- Parameters:
input_column – Column to be converted to lists of bytes
endian_configuration – Whether to retain or flip the endianness of the elements
mr – Device memory resource used to allocate the returned column’s device memory
- Returns:
The column containing the lists of bytes
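The byte order in the example can be reproduced with a host-side model. This is a sketch in plain C++, not libcudf code: `byte_cast_model` and its `flip` flag are hypothetical names, and the sketch assumes (this is not stated in the text above) that unflipped elements are stored least significant byte first, so that flipping yields the most-significant-first lists shown for flip_endianness::YES.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// CPU model of converting int32 elements to lists of bytes.
// Assumption: the unflipped layout is little-endian (least significant byte first).
std::vector<Bytes> byte_cast_model(std::vector<std::int32_t> const& input, bool flip)
{
  std::vector<Bytes> out;
  out.reserve(input.size());
  for (std::int32_t value : input) {
    auto const bits = static_cast<std::uint32_t>(value);
    Bytes bytes;
    for (int shift = 0; shift < 32; shift += 8) {
      bytes.push_back(static_cast<std::uint8_t>((bits >> shift) & 0xFFu));
    }
    assert(bytes.size() == sizeof(std::int32_t));
    if (flip) { std::reverse(bytes.begin(), bytes.end()); }  // most significant byte first
    out.push_back(bytes);
  }
  return out;
}
```

For the input [8675, 309] with `flip` set, the model returns [[0x00, 0x00, 0x21, 0xe3], [0x00, 0x00, 0x01, 0x35]], the same lists as in the example.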