Column Merge
- group column_merge
Functions
-
std::unique_ptr<cudf::table> merge(std::vector<table_view> const &tables_to_merge, std::vector<cudf::size_type> const &key_cols, std::vector<cudf::order> const &column_order, std::vector<cudf::null_order> const &null_precedence = {}, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())
Merge a set of sorted tables.
Merges sorted tables into one sorted table containing data from all tables. The key columns of each input table must already be sorted according to the parameters (column_order and null_precedence) specified for that column.
Example 1:
  input:
    table 1 => col 1 {0, 1, 2, 3}
               col 2 {4, 5, 6, 7}
    table 2 => col 1 {1, 2}
               col 2 {8, 9}
    table 3 => col 1 {2, 4}
               col 2 {8, 9}
  output:
    table   => col 1 {0, 1, 1, 2, 2, 2, 3, 4}
               col 2 {4, 5, 8, 6, 8, 9, 7, 9}
Example 2:
  input:
    table 1 => col 0 {1, 0}
               col 1 {'c', 'b'}
               col 2 {RED, GREEN}
    table 2 => col 0 {1}
               col 1 {'a'}
               col 2 {NULL}
  with key_cols = {0, 1} and column_order = {ASCENDING, ASCENDING}.
  Lex-sorting is on columns {0, 1}; hence, lex-sorting of ((L0 x L1) V (R0 x R1)) is:
    (0, 'b', GREEN), (1, 'a', NULL), (1, 'c', RED)
  The third column, the "color", just "goes along for the ride": it is permuted according to the data movements dictated by the lexicographic ordering of columns 0 and 1.
  Result columns:
    Res0 = {0, 1, 1}
    Res1 = {'b', 'a', 'c'}
    Res2 = {GREEN, NULL, RED}
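The merge semantics in the examples above can be modelled on the host with the C++ standard library. The following is a minimal sketch, not libcudf's GPU implementation: `Row`, `Table`, `Order`, `row_less`, and `merge_sorted` are hypothetical names introduced only for illustration, cells are restricted to `int`, and nulls are not modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical host-side model: a table is a list of rows of int cells.
using Row   = std::vector<int>;
using Table = std::vector<Row>;

enum class Order { ASCENDING, DESCENDING };

// Returns true when row a sorts strictly before row b, comparing only the
// key columns, lexicographically, each with its own sort order.
bool row_less(Row const& a, Row const& b,
              std::vector<std::size_t> const& key_cols,
              std::vector<Order> const& column_order)
{
  for (std::size_t i = 0; i < key_cols.size(); ++i) {
    int const x = a[key_cols[i]];
    int const y = b[key_cols[i]];
    if (x == y) { continue; }  // tie on this key, fall through to the next key
    return column_order[i] == Order::ASCENDING ? x < y : x > y;
  }
  return false;  // all keys equal: neither row precedes the other
}

// Folds the already-sorted inputs together one table at a time.
// Non-key columns travel with their row, as in Example 2.
Table merge_sorted(std::vector<Table> const& tables,
                   std::vector<std::size_t> const& key_cols,
                   std::vector<Order> const& column_order)
{
  Table result;
  for (Table const& t : tables) {
    Table merged;
    merged.reserve(result.size() + t.size());
    std::merge(result.begin(), result.end(), t.begin(), t.end(),
               std::back_inserter(merged),
               [&](Row const& a, Row const& b) {
                 return row_less(a, b, key_cols, column_order);
               });
    result = std::move(merged);
  }
  return result;
}
```

Calling `merge_sorted` on the three tables of Example 1 with `key_cols = {0}` and `column_order = {Order::ASCENDING}` yields the key column {0, 1, 1, 2, 2, 2, 3, 4}. The relative order of rows with equal keys in this sketch follows `std::merge` (earlier tables first) and may differ from the order shown in the examples.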
- Throws:
cudf::logic_error – if tables in tables_to_merge have different numbers of columns
cudf::logic_error – if tables in tables_to_merge have columns with mismatched types
cudf::logic_error – if key_cols is empty
cudf::logic_error – if key_cols size is larger than the number of columns in tables_to_merge tables
cudf::logic_error – if key_cols size and column_order size do not match
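The preconditions listed above can be checked on the host before calling merge. The sketch below is illustrative only: `Schema` and `validate_merge_args` are hypothetical names, column types are modelled as plain strings, `std::logic_error` stands in for `cudf::logic_error`, and the order in which the checks run is an assumption rather than documented behavior.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a table's schema: one type name per column.
using Schema = std::vector<std::string>;

// Throws std::logic_error if the arguments violate a documented precondition.
void validate_merge_args(std::vector<Schema> const& schemas,
                         std::vector<int> const& key_cols,
                         std::size_t column_order_size)
{
  // Every table must have the same column count and column types as the first.
  for (std::size_t i = 1; i < schemas.size(); ++i) {
    if (schemas[i].size() != schemas.front().size()) {
      throw std::logic_error("tables have different numbers of columns");
    }
    if (schemas[i] != schemas.front()) {
      throw std::logic_error("tables have columns with mismatched types");
    }
  }
  if (key_cols.empty()) { throw std::logic_error("key_cols is empty"); }
  if (!schemas.empty() && key_cols.size() > schemas.front().size()) {
    throw std::logic_error("more key columns than table columns");
  }
  if (key_cols.size() != column_order_size) {
    throw std::logic_error("key_cols and column_order sizes do not match");
  }
}
```

Running such a check up front gives a readable error on the host side; the library call itself still performs its own validation.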
- Parameters:
tables_to_merge – [in] Non-empty list of tables to be merged
key_cols – [in] Indices of the columns in each table of tables_to_merge to be used as comparison keys
column_order – [in] Sort order types of columns indexed by key_cols
null_precedence – [in] Array indicating the order of nulls with respect to non-nulls for the indexing columns (key_cols)
stream – CUDA stream used for device memory operations and kernel launches
mr – Device memory resource used to allocate the returned table’s device memory
- Returns:
A table containing sorted data from all input tables