Copy Gather

group copy_gather

Functions

std::unique_ptr<table> gather(table_view const &source_table, column_view const &gather_map, out_of_bounds_policy bounds_policy = out_of_bounds_policy::DONT_CHECK, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())

Gathers the specified rows (including null values) of a set of columns.

Gathers the rows of the source columns according to gather_map such that row “i” in the resulting table’s columns will contain row “gather_map[i]” from the source columns. The number of rows in the result table will be equal to the number of elements in gather_map.

A negative value i in the gather_map is interpreted as i+n, where n is the number of rows in the source_table.
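The row mapping and the negative-index rule described above can be modeled on the host with standard containers. The following is only a sketch of the semantics for a single integer column: `gather_model` is a hypothetical helper, not part of libcudf, and it runs on the CPU rather than launching a kernel. It asserts on out-of-range indices, whereas the real function's behavior in that case depends on `bounds_policy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model of gather on one column (hypothetical helper, not libcudf code).
// result[i] holds source[gather_map[i]], so the result has gather_map.size() rows.
std::vector<int> gather_model(std::vector<int> const& source,
                              std::vector<std::int32_t> const& gather_map)
{
  auto const n = static_cast<std::int32_t>(source.size());
  std::vector<int> result;
  result.reserve(gather_map.size());
  for (std::int32_t i : gather_map) {
    // A negative index i is interpreted as i + n.
    std::int32_t const idx = (i < 0) ? i + n : i;
    // This model rejects out-of-range indices outright.
    assert(idx >= 0 && idx < n);
    result.push_back(source[static_cast<std::size_t>(idx)]);
  }
  return result;
}
```

For a source column {10, 20, 30, 40} and a gather map {0, -1, 2, -4, 0}, this model yields {10, 40, 30, 10, 10}: five output rows, with -1 and -4 resolving to rows 3 and 0.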

For dictionary columns, the keys column component is copied and not trimmed if the gather results in abandoned key elements.
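To illustrate the dictionary note, here is a simplified host-side model. It assumes a dictionary column can be pictured as a keys array plus one index per row; `dictionary_model` and `gather_dictionary_model` are hypothetical names for this sketch, and nulls and negative indices are left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified picture of a dictionary column: unique keys plus one index per row.
// Hypothetical type for illustration only.
struct dictionary_model {
  std::vector<std::string> keys;
  std::vector<int> indices;
};

// Gathers the rows (indices) and copies the keys unchanged.
dictionary_model gather_dictionary_model(dictionary_model const& source,
                                         std::vector<int> const& gather_map)
{
  dictionary_model result;
  result.keys = source.keys;  // copied whole; unreferenced keys are not trimmed
  result.indices.reserve(gather_map.size());
  for (int i : gather_map) {
    assert(i >= 0 && static_cast<std::size_t>(i) < source.indices.size());
    result.indices.push_back(source.indices[static_cast<std::size_t>(i)]);
  }
  return result;
}
```

Gathering rows {1, 3} from keys {"a", "b", "c"} with indices {0, 1, 2, 1} leaves all three keys in place even though the gathered rows reference only "b".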

Throws:

std::invalid_argument – if gather_map contains null values.

Parameters:
  • source_table – The input columns whose rows will be gathered

  • gather_map – View into a non-nullable column of integral indices that maps the rows in the source columns to rows in the destination columns.

  • bounds_policy – Policy for handling possibly out-of-bounds indices. DONT_CHECK skips all bounds checking of gather map values. NULLIFY sets rows that correspond to out-of-bounds indices in the gather map to null. For better performance, callers should use DONT_CHECK when they are certain that gather_map contains only valid indices. If the policy is DONT_CHECK and the gather map contains out-of-bounds indices, the behavior is undefined. Defaults to DONT_CHECK.

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate the returned table’s device memory
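The effect of the NULLIFY policy can be sketched on the host with `std::optional` standing in for a nullable row. This is a model of the documented semantics, not libcudf code: `gather_nullify_model` is a hypothetical helper, and the sketch assumes that the negative-index wrap (i + n) is applied before the bounds check.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Host-side model of gather with NULLIFY semantics (hypothetical helper).
// An empty optional represents a null row in the result.
std::vector<std::optional<int>> gather_nullify_model(
  std::vector<int> const& source, std::vector<std::int32_t> const& gather_map)
{
  auto const n = static_cast<std::int32_t>(source.size());
  std::vector<std::optional<int>> result;
  result.reserve(gather_map.size());
  for (std::int32_t i : gather_map) {
    // Assumption of this sketch: wrap negative indices first, then check bounds.
    std::int32_t const idx = (i < 0) ? i + n : i;
    if (idx >= 0 && idx < n) {
      result.push_back(source[static_cast<std::size_t>(idx)]);
    } else {
      result.push_back(std::nullopt);  // out-of-bounds index becomes a null row
    }
  }
  return result;
}
```

With a source column {10, 20, 30} and a gather map {0, 5, 2, -1, -7}, the model produces {10, null, 30, 30, null}. Under DONT_CHECK the same map would be invalid input, since indices 5 and -7 fall outside the source.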

Returns:

Result of the gather