Files

file  merge.hpp
 

Functions

std::unique_ptr< cudf::table > cudf::merge (std::vector< table_view > const &tables_to_merge, std::vector< cudf::size_type > const &key_cols, std::vector< cudf::order > const &column_order, std::vector< cudf::null_order > const &null_precedence={}, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=rmm::mr::get_current_device_resource())
 Merge a set of sorted tables.
 

Function Documentation

merge()

std::unique_ptr<cudf::table> cudf::merge (
    std::vector< table_view > const & tables_to_merge,
    std::vector< cudf::size_type > const & key_cols,
    std::vector< cudf::order > const & column_order,
    std::vector< cudf::null_order > const & null_precedence = {},
    rmm::cuda_stream_view stream = cudf::get_default_stream(),
    rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource()
)

Merge a set of sorted tables.

Merges sorted tables into one sorted table containing data from all tables. The key columns of each input table must already be sorted according to the column_order and null_precedence entries specified for that column.
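The sortedness precondition can be illustrated on the host with a small check. This is only a sketch under stated assumptions, not part of libcudf: `Order`, `Row`, `Table`, `row_before` and `is_sorted_on_keys` are hypothetical stand-ins, every column holds plain `int` values, and nulls (and therefore null_precedence) are ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-ins; these are NOT the cudf types.
enum class Order { ASCENDING, DESCENDING };
using Row   = std::vector<int>;  // one row, one int per column
using Table = std::vector<Row>;  // row-major table

// True when row `a` must come strictly before row `b`, comparing the key
// columns lexicographically and honoring the per-key sort order.
bool row_before(Row const& a,
                Row const& b,
                std::vector<std::size_t> const& key_cols,
                std::vector<Order> const& column_order)
{
  for (std::size_t k = 0; k < key_cols.size(); ++k) {
    int const x = a[key_cols[k]];
    int const y = b[key_cols[k]];
    if (x == y) continue;  // tie on this key, fall through to the next key
    return column_order[k] == Order::ASCENDING ? x < y : x > y;
  }
  return false;  // equal on every key
}

// A table meets the precondition when no row must come before its predecessor.
bool is_sorted_on_keys(Table const& table,
                       std::vector<std::size_t> const& key_cols,
                       std::vector<Order> const& column_order)
{
  return std::is_sorted(table.begin(), table.end(), [&](Row const& a, Row const& b) {
    return row_before(a, b, key_cols, column_order);
  });
}
```

Running a check of this kind on each input before merging makes a violated precondition visible early instead of showing up as an unsorted result.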

Example 1:
  input:
    table 1 => col 1 {0, 1, 2, 3}
               col 2 {4, 5, 6, 7}
    table 2 => col 1 {1, 2}
               col 2 {8, 9}
    table 3 => col 1 {2, 4}
               col 2 {8, 9}
  output:
    table   => col 1 {0, 1, 1, 2, 2, 2, 3, 4}
               col 2 {4, 5, 8, 6, 8, 9, 7, 9}
Example 2:
  input:
    table 1 => col 0 {0, 1}
               col 1 {'b', 'c'}
               col 2 {GREEN, RED}
    table 2 => col 0 {1}
               col 1 {'a'}
               col 2 {NULL}
  with key_cols[] = {0, 1}
  and column_order[] = {ASCENDING, ASCENDING}
  Lex-sorting is on columns {0, 1}; hence, the lex-sorted union of the rows of table 1 and table 2 is:
    (0, 'b', GREEN), (1, 'a', NULL), (1, 'c', RED)
  (The third column, the "color", just "goes along for the ride":
  it is permuted according to the data movements dictated
  by the lexicographic ordering of columns 0 and 1.)
  with result columns:
    Res0 = {0, 1, 1}
    Res1 = {'b', 'a', 'c'}
    Res2 = {GREEN, NULL, RED}
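The merge semantics shown in the examples can be modeled on the host with a heap-based k-way merge. This is a minimal sketch under stated assumptions, not how libcudf implements the operation on the GPU: `Order`, `Row`, `Table`, `row_before` and `merge_sorted` are hypothetical names, every column holds plain `int` values, and nulls are not modeled.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical host-side stand-ins; these are NOT the cudf types.
enum class Order { ASCENDING, DESCENDING };
using Row   = std::vector<int>;  // one row, one int per column
using Table = std::vector<Row>;  // row-major table

// True when row `a` must come strictly before row `b` on the key columns.
bool row_before(Row const& a,
                Row const& b,
                std::vector<std::size_t> const& key_cols,
                std::vector<Order> const& column_order)
{
  for (std::size_t k = 0; k < key_cols.size(); ++k) {
    int const x = a[key_cols[k]];
    int const y = b[key_cols[k]];
    if (x == y) continue;
    return column_order[k] == Order::ASCENDING ? x < y : x > y;
  }
  return false;
}

// Merge tables that are each already sorted on the key columns.
// Non-key columns travel with their row.
Table merge_sorted(std::vector<Table> const& tables,
                   std::vector<std::size_t> const& key_cols,
                   std::vector<Order> const& column_order)
{
  struct Cursor {
    std::size_t table;  // which input table
    std::size_t row;    // next unread row in that table
  };
  // std::priority_queue pops the "largest" element, so the comparison is
  // inverted to make the heap yield the row that sorts first.
  auto later = [&](Cursor const& a, Cursor const& b) {
    return row_before(tables[b.table][b.row], tables[a.table][a.row], key_cols, column_order);
  };
  std::priority_queue<Cursor, std::vector<Cursor>, decltype(later)> heap(later);

  for (std::size_t t = 0; t < tables.size(); ++t) {
    if (!tables[t].empty()) heap.push({t, 0});
  }

  Table out;
  while (!heap.empty()) {
    Cursor const c = heap.top();
    heap.pop();
    out.push_back(tables[c.table][c.row]);
    if (c.row + 1 < tables[c.table].size()) heap.push({c.table, c.row + 1});
  }
  return out;
}
```

Fed the three tables of Example 1 with both columns treated as ascending keys (an assumption, since that example does not state its key_cols), this sketch yields the output columns listed there. Rows that tie on every key come out in no guaranteed relative order.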
Exceptions
cudf::logic_error  if tables in tables_to_merge have different numbers of columns
cudf::logic_error  if tables in tables_to_merge have columns with mismatched types
cudf::logic_error  if key_cols is empty
cudf::logic_error  if the size of key_cols is larger than the number of columns in the tables of tables_to_merge
cudf::logic_error  if the sizes of key_cols and column_order do not match
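The conditions above can be mirrored in a caller-side check that runs before any device work is queued. The following is a sketch only: `Schema` and `check_merge_args` are hypothetical names, column types are represented as plain strings, and `std::logic_error` stands in for `cudf::logic_error`. An empty table list is not covered by the listed exceptions, so the sketch simply has nothing to check in that case.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical: a table described only by the type name of each column.
using Schema = std::vector<std::string>;

// Throws std::logic_error for each condition in the Exceptions list.
void check_merge_args(std::vector<Schema> const& schemas,
                      std::size_t key_cols_size,
                      std::size_t column_order_size)
{
  if (schemas.empty()) return;  // not covered by the listed exceptions

  Schema const& first = schemas.front();
  for (Schema const& s : schemas) {
    if (s.size() != first.size()) {
      throw std::logic_error("tables have different numbers of columns");
    }
    if (s != first) {
      throw std::logic_error("tables have columns with mismatched types");
    }
  }
  if (key_cols_size == 0) {
    throw std::logic_error("key_cols is empty");
  }
  if (key_cols_size > first.size()) {
    throw std::logic_error("key_cols is larger than the number of columns");
  }
  if (key_cols_size != column_order_size) {
    throw std::logic_error("key_cols and column_order sizes do not match");
  }
}
```

A caller could invoke this with the column types of each table view and the sizes of key_cols and column_order, and only proceed to the merge when it returns without throwing.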
Parameters
[in] tables_to_merge  Non-empty list of tables to be merged
[in] key_cols  Indices of the columns in each table of tables_to_merge to be used as comparison keys
[in] column_order  Sort order of each column indexed by key_cols
[in] null_precedence  Array indicating the order of nulls with respect to non-nulls for each key column (key_cols)
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned table's device memory
Returns
A table containing sorted data from all input tables