Copy Scatter

group copy_scatter

Functions

std::unique_ptr<table> scatter(table_view const &source, column_view const &scatter_map, table_view const &target, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())

Scatters the rows of the source table into a copy of the target table according to a scatter map.

Scatters values from the source table into the target table out-of-place, returning a “destination table”. The scatter is performed according to a scatter map such that row scatter_map[i] of the destination table gets row i of the source table. All other rows of the destination table equal corresponding rows of the target table.

The number of columns in source must match the number of columns in target and their corresponding datatypes must be the same.

If the same index appears more than once in the scatter map, the result is undefined.

If any values in scatter_map are outside of the interval [-n, n) where n is the number of rows in the target table, behavior is undefined.

A negative value i in the scatter_map is interpreted as i+n, where n is the number of rows in the target table.
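The rules above can be illustrated with a minimal host-side sketch in plain C++ that models the scatter semantics on a single integer column. This is not the libcudf implementation: scatter_rows_model is a hypothetical helper, the size check follows the scatter_map parameter description (a map no longer than source), and the model throws on out-of-range indices where libcudf leaves the behavior undefined.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model of scatter for one int column (not libcudf code).
// Row scatter_map[i] of the result receives source[i]; all other rows come from target.
std::vector<int> scatter_rows_model(std::vector<int> const& source,
                                    std::vector<std::int32_t> const& scatter_map,
                                    std::vector<int> const& target)
{
  // Assumption: the map may be shorter than source but never longer.
  if (scatter_map.size() > source.size()) {
    throw std::invalid_argument("scatter_map has more elements than source has rows");
  }

  auto const n = static_cast<std::int64_t>(target.size());
  std::vector<int> result = target;  // out-of-place: start from a copy of target

  for (std::size_t i = 0; i < scatter_map.size(); ++i) {
    std::int64_t idx = scatter_map[i];
    // libcudf leaves this case undefined; the model rejects it instead.
    if (idx < -n || idx >= n) { throw std::out_of_range("index outside [-n, n)"); }
    if (idx < 0) { idx += n; }  // a negative index i is read as i + n
    // Duplicate indices are undefined in libcudf; here the last write simply wins.
    result[static_cast<std::size_t>(idx)] = source[i];
  }
  return result;
}
```

For example, scattering {10, 20, 30} with the map {0, -1, 2} into the five-row target {1, 2, 3, 4, 5} replaces rows 0, 4 and 2, because -1 is interpreted as -1 + 5 = 4.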

Throws:
  • std::invalid_argument – if the number of columns in source does not match the number of columns in target

  • std::invalid_argument – if the number of rows in source does not match the number of elements in scatter_map

  • cudf::data_type_error – if the data types of the source and target columns do not match

  • std::invalid_argument – if scatter_map contains null values

Parameters:
  • source – The input columns containing values to be scattered into the target columns

  • scatter_map – A non-nullable column of integral indices that maps the rows in the source table to rows in the target table. The size must be equal to or less than the number of elements in the source columns.

  • target – The set of columns into which values from source are to be scattered

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate the returned table’s device memory

Returns:

Result of scattering values from source to target

std::unique_ptr<table> scatter(std::vector<std::reference_wrapper<scalar const>> const &source, column_view const &indices, table_view const &target, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())

Scatters a row of scalar values into a copy of the target table according to a scatter map.

Scatters values from the source row into the target table out-of-place, returning a “destination table”. The scatter is performed according to the indices column such that row indices[i] of the destination table is replaced by the source row. All other rows of the destination table equal corresponding rows of the target table.

The number of elements in source must match the number of columns in target and their corresponding datatypes must be the same.

If the same index appears more than once in indices, the result is undefined.

If any values in indices are outside of the interval [-n, n) where n is the number of rows in the target table, behavior is undefined.
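As a rough illustration of the scalar overload, the sketch below models a table as a vector of integer columns and writes one scalar per column into every indexed row. It is a host-side model in plain C++, not libcudf code: the Column and Table aliases and scatter_scalar_row_model are hypothetical names, negative indices are assumed to wrap as i + n (which the [-n, n) interval suggests), and out-of-range indices throw here although libcudf leaves that case undefined.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Column = std::vector<int>;
using Table  = std::vector<Column>;  // one inner vector per column

// Hypothetical host-side model of the scalar overload (not libcudf code).
// source_row[c] is written into column c at every row listed in indices.
Table scatter_scalar_row_model(std::vector<int> const& source_row,
                               std::vector<std::int32_t> const& indices,
                               Table const& target)
{
  if (source_row.size() != target.size()) {
    throw std::invalid_argument("number of scalars must match number of target columns");
  }

  Table result = target;  // out-of-place: start from a copy of target
  for (std::size_t c = 0; c < result.size(); ++c) {
    auto const n = static_cast<std::int64_t>(result[c].size());
    for (std::int32_t raw : indices) {
      std::int64_t idx = raw;
      // Undefined in libcudf; the model rejects it instead.
      if (idx < -n || idx >= n) { throw std::out_of_range("index outside [-n, n)"); }
      if (idx < 0) { idx += n; }  // assumption: negative indices wrap as i + n
      result[c][static_cast<std::size_t>(idx)] = source_row[c];
    }
  }
  return result;
}
```

With a source row of {7, 9} and indices {1, 3}, rows 1 and 3 of a two-column target are overwritten with 7 in the first column and 9 in the second, and the remaining rows are left as they were.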

Throws:
  • std::invalid_argument – if the number of scalars does not match the number of columns in target

  • std::invalid_argument – if indices contains null values

  • cudf::data_type_error – if the data types of the scalars and target columns do not match

Parameters:
  • source – The input scalars containing values to be scattered into the target columns

  • indices – A non-nullable column of integral indices that indicate the rows in the target table to be replaced by source.

  • target – The set of columns into which the source scalars are to be scattered

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate the returned table’s device memory

Returns:

Result of scattering values from source to target

std::unique_ptr<table> boolean_mask_scatter(table_view const &input, table_view const &target, column_view const &boolean_mask, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())

Scatters rows from the input table to rows of the output corresponding to true values in a boolean mask.

The ith row of input will be written to the output table at the location of the ith true value in boolean_mask. All other rows in the output will equal the same row in target.

The number of true values in boolean_mask must not exceed the number of rows in input. Where boolean_mask is true, the corresponding row of target is replaced by the next unused row of input; where it is false, the target row is left unchanged.

Example:
input: {{1, 5, 6, 8, 9}}
boolean_mask: {true, false, false, false, true, true, false, true, true, false}
target:       {{   2,     2,     3,     4,    4,     7,    7,    7,    8,    10}}

output:       {{   1,     2,     3,     4,    5,     6,    7,    8,    9,    10}}
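
The example above can be reproduced with a small host-side model in plain C++. This is a sketch of the documented semantics on a single integer column, not the libcudf implementation; boolean_mask_scatter_model is a hypothetical helper, and only the two size checks from the documented exceptions are modeled.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model of boolean_mask_scatter for one int column (not libcudf code).
// The ith value of input lands at the position of the ith true value in boolean_mask.
std::vector<int> boolean_mask_scatter_model(std::vector<int> const& input,
                                            std::vector<int> const& target,
                                            std::vector<bool> const& boolean_mask)
{
  if (boolean_mask.size() != target.size()) {
    throw std::invalid_argument("boolean_mask size must equal the number of target rows");
  }
  auto const true_count = static_cast<std::size_t>(
    std::count(boolean_mask.begin(), boolean_mask.end(), true));
  if (true_count > input.size()) {
    throw std::invalid_argument("more true values in boolean_mask than rows in input");
  }

  std::vector<int> result = target;  // out-of-place: start from a copy of target
  std::size_t next_input = 0;        // next unused row of input
  for (std::size_t row = 0; row < result.size(); ++row) {
    if (boolean_mask[row]) { result[row] = input[next_input++]; }
  }
  return result;
}
```

Calling the model with the input, boolean_mask and target values from the example yields the output row shown there.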
Throws:
  • std::invalid_argument – if input.num_columns() != target.num_columns()

  • cudf::data_type_error – if any ith input_column type != ith target_column type

  • cudf::data_type_error – if boolean_mask.type() != bool

  • std::invalid_argument – if boolean_mask.size() != target.num_rows()

  • std::invalid_argument – if number of true in boolean_mask > input.num_rows()

Parameters:
  • input – table_view (set of dense columns) to scatter

  • target – table_view to modify with scattered values from input

  • boolean_mask – column_view which acts as boolean mask

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate device memory of the returned table

Returns:

Returns a table by scattering input into target as per boolean_mask

std::unique_ptr<table> boolean_mask_scatter(std::vector<std::reference_wrapper<scalar const>> const &input, table_view const &target, column_view const &boolean_mask, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())

Scatters scalar values to rows of the output corresponding to true values in a boolean mask.

The ith scalar in input will be written to the ith column of the output table at the location of every true value in boolean_mask. All other rows in the output will equal the same row in target.

Example:
input: {11}
boolean_mask: {true, false, false, false, true, true, false, true, true, false}
target:      {{   2,     2,     3,     4,    4,     7,    7,    7,    8,    10}}

output:       {{   11,    2,     3,     4,   11,    11,    7,   11,   11,    10}}
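
A host-side model of the scalar variant, again in plain C++ and restricted to one integer column, might look like the sketch below. It is not the libcudf implementation, and boolean_mask_scatter_scalar_model is a hypothetical name; the only check modeled is the mask-size requirement.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side model of the scalar boolean_mask_scatter (not libcudf code).
// The scalar replaces every target row whose mask entry is true.
std::vector<int> boolean_mask_scatter_scalar_model(int input_value,
                                                   std::vector<int> const& target,
                                                   std::vector<bool> const& boolean_mask)
{
  if (boolean_mask.size() != target.size()) {
    throw std::invalid_argument("boolean_mask size must equal the number of target rows");
  }

  std::vector<int> result = target;  // out-of-place: start from a copy of target
  for (std::size_t row = 0; row < result.size(); ++row) {
    if (boolean_mask[row]) { result[row] = input_value; }
  }
  return result;
}
```

Unlike the table overload, the same value is reused for every true position, so there is no limit on how many true values the mask may contain.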
Throws:
  • std::invalid_argument – if input.size() != target.num_columns()

  • cudf::data_type_error – if any ith input scalar type != ith target_column type

  • cudf::data_type_error – if boolean_mask.type() != bool

  • std::invalid_argument – if boolean_mask.size() != target.num_rows()

Parameters:
  • input – scalars to scatter

  • target – table_view to modify with scattered values from input

  • boolean_mask – column_view which acts as boolean mask

  • stream – CUDA stream used for device memory operations and kernel launches

  • mr – Device memory resource used to allocate device memory of the returned table

Returns:

Returns a table by scattering input into target as per boolean_mask