Lists Filling

group lists_filling

Functions

std::unique_ptr<column> sequences(column_view const &starts, column_view const &sizes, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())

Create a lists column in which each row contains a sequence of values specified by a tuple of (start, size) parameters.

Each row is a sequence that starts at a start value, increments by one, and contains the number of elements given by a size value. The start and size values used to generate each list are taken from the corresponding rows of the input starts and sizes columns.

  • sizes must be a column of an integer type.

  • None of the input columns may contain nulls.

  • If any row of the sizes column contains a negative value, the output is undefined.

starts = [0, 1, 2, 3, 4]
sizes  = [0, 2, 2, 1, 3]

output = [ [], [1, 2], [2, 3], [3], [4, 5, 6] ]
Throws:
  • cudf::logic_error – if the sizes column is not of an integer type.

  • cudf::logic_error – if any input column has nulls.

  • cudf::logic_error – if starts and sizes columns do not have the same size.

  • std::overflow_error – if the output column would exceed the column size limit.

Parameters:
  • starts – First values in the result sequences.

  • sizes – Numbers of values in the result sequences.

  • stream – CUDA stream used for device memory operations and kernel launches.

  • mr – Device memory resource used to allocate the returned column’s device memory.

Returns:

The result column containing generated sequences.
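
The row-by-row semantics described above can be sketched with a small host-side model. This is only an illustration that uses std::vector in place of device columns; the name sequences_model is hypothetical and is not part of libcudf, and the sketch says nothing about how the library computes the result on the GPU. Because the library leaves negative sizes undefined, the sketch simply asserts against them.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical host-side model of the (start, size) overload; not part of libcudf.
std::vector<std::vector<int>> sequences_model(std::vector<int> const& starts,
                                              std::vector<int> const& sizes)
{
  // Mirrors the documented requirement that both inputs have the same number of rows.
  if (starts.size() != sizes.size()) {
    throw std::invalid_argument("starts and sizes must have the same size");
  }

  std::vector<std::vector<int>> output;
  output.reserve(starts.size());
  for (std::size_t row = 0; row < starts.size(); ++row) {
    // The library leaves negative sizes undefined; this sketch rejects them.
    assert(sizes[row] >= 0);

    std::vector<int> list;
    list.reserve(static_cast<std::size_t>(sizes[row]));
    // Each list starts at starts[row] and increments by one.
    for (int i = 0; i < sizes[row]; ++i) {
      list.push_back(starts[row] + i);
    }
    output.push_back(std::move(list));
  }
  return output;
}
```

Called with the starts and sizes from the example above, this model yields the same five lists, including the empty list for the row whose size is 0.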

std::unique_ptr<column> sequences(column_view const &starts, column_view const &steps, column_view const &sizes, rmm::cuda_stream_view stream = cudf::get_default_stream(), rmm::device_async_resource_ref mr = rmm::mr::get_current_device_resource())

Create a lists column in which each row contains a sequence of values specified by a tuple of (start, step, size) parameters.

Each row is a sequence that starts at a start value, increments by a step value, and contains the number of elements given by a size value. The start, step, and size values used to generate each list are taken from the corresponding rows of the input starts, steps, and sizes columns.

  • sizes must be a column of an integer type.

  • starts and steps columns must have the same type.

  • None of the input columns may contain nulls.

  • If any row of the sizes column contains a negative value, the output is undefined.

starts = [0, 1, 2, 3, 4]
steps  = [2, 1, 1, 1, -3]
sizes  = [0, 2, 2, 1, 3]

output = [ [], [1, 2], [2, 3], [3], [4, 1, -2] ]
Throws:
  • cudf::logic_error – if the sizes column is not of an integer type.

  • cudf::logic_error – if any input column has nulls.

  • cudf::logic_error – if starts and steps columns have different types.

  • cudf::logic_error – if starts, steps, and sizes columns do not have the same size.

  • std::overflow_error – if the output column would exceed the column size limit.

Parameters:
  • starts – First values in the result sequences.

  • steps – Increment values for the result sequences.

  • sizes – Numbers of values in the result sequences.

  • stream – CUDA stream used for device memory operations and kernel launches.

  • mr – Device memory resource used to allocate the returned column’s device memory.

Returns:

The result column containing generated sequences.
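
The stepped overload can be modelled on the host in the same spirit. The sketch below is an illustration only: stepped_sequences_model is a hypothetical name, std::vector stands in for device columns, and a single template parameter is used for starts and steps to mirror the same-type requirement. It also assumes that the column size limit is the maximum value of a 32-bit signed row count; that limit is an assumption of this sketch rather than something stated in the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical host-side model of the (start, step, size) overload; not part of libcudf.
// One template parameter T for both starts and steps mirrors the same-type rule.
template <typename T>
std::vector<std::vector<T>> stepped_sequences_model(std::vector<T> const& starts,
                                                    std::vector<T> const& steps,
                                                    std::vector<std::int32_t> const& sizes)
{
  if (starts.size() != steps.size() || starts.size() != sizes.size()) {
    throw std::invalid_argument("starts, steps, and sizes must have the same size");
  }

  // Assumption: the output may hold at most INT32_MAX elements in total.
  constexpr std::int64_t limit = std::numeric_limits<std::int32_t>::max();
  std::int64_t total = 0;
  for (std::int32_t size : sizes) {
    // The library leaves negative sizes undefined; this sketch rejects them.
    assert(size >= 0);
    total += size;
    if (total > limit) { throw std::overflow_error("output exceeds the column size limit"); }
  }

  std::vector<std::vector<T>> output;
  output.reserve(starts.size());
  for (std::size_t row = 0; row < starts.size(); ++row) {
    std::vector<T> list;
    list.reserve(static_cast<std::size_t>(sizes[row]));
    // Element i of the list is start + i * step.
    for (std::int32_t i = 0; i < sizes[row]; ++i) {
      list.push_back(static_cast<T>(starts[row] + static_cast<T>(i) * steps[row]));
    }
    output.push_back(std::move(list));
  }
  return output;
}
```

With the starts, steps, and sizes from the example above, the last row evaluates to 4, 4 + (-3), and 4 + 2 * (-3), which matches the documented [4, 1, -2]. The total size is checked before any list is built, so an oversized request fails without allocating.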