Files
file	contiguous_split.hpp
	Table APIs for contiguous_split, pack, unpack, and metadata.

file	copying.hpp
	Column APIs for gather, scatter, split, slice, etc.

Classes
struct	cudf::packed_columns
	Column data in a serialized format.

struct	cudf::packed_table
	The result(s) of a cudf::contiguous_split.

class	cudf::chunked_pack
	Perform a chunked "pack" operation of the input `table_view` using a user-provided buffer of size `user_buffer_size`.

Functions
std::vector< packed_table >	cudf::contiguous_split (cudf::table_view const &input, std::vector< size_type > const &splits, rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Performs a deep-copy split of a `table_view` into a vector of `packed_table`, where each `packed_table` uses a single contiguous block of memory for all of the split's column data.

packed_columns	cudf::pack (cudf::table_view const &input, rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Deep-copy a `table_view` into a serialized contiguous memory format.

std::vector< uint8_t >	cudf::pack_metadata (table_view const &table, uint8_t const *contiguous_buffer, size_t buffer_size)
	Produce the metadata used for packing a table stored in a contiguous buffer.

table_view	cudf::unpack (packed_columns const &input)
	Deserialize the result of `cudf::pack`.

table_view	cudf::unpack (uint8_t const *metadata, uint8_t const *gpu_data)
	Deserialize the result of `cudf::pack`.

std::vector< column_view >	cudf::split (column_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `column_view` into a set of `column_view`s according to a set of indices derived from expected splits.

std::vector< column_view >	cudf::split (column_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `column_view` into a set of `column_view`s according to a set of indices derived from expected splits.

std::vector< table_view >	cudf::split (table_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `table_view` into a set of `table_view`s according to a set of indices derived from expected splits.

std::vector< table_view >	cudf::split (table_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `table_view` into a set of `table_view`s according to a set of indices derived from expected splits.

Detailed Description

Function Documentation

◆ contiguous_split()

std::vector<packed_table> cudf::contiguous_split	(	cudf::table_view const &	input,
		std::vector< size_type > const &	splits,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Performs a deep-copy split of a table_view into a vector of packed_table, where each packed_table uses a single contiguous block of memory for all of the split's column data.

The memory for the output views is allocated in a single contiguous rmm::device_buffer returned in the packed_table. There is no top-level owning table.

The returned views of input are constructed from a vector of indices that indicate where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory contained in the all_data field of the returned packed_table.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]
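
The layout idea can be illustrated with a minimal host-only sketch. This is not cudf code: plain `std::vector<int>` columns stand in for device columns, and the names `packed_partition` and `contiguous_split_model` are hypothetical. It only shows how each split can own one buffer that holds all of that split's column data back to back. It assumes `splits` is already valid and performs no error checking.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for one packed split (not a cudf type):
// a single owning buffer plus the position where each column starts in it.
struct packed_partition {
  std::vector<int> buffer;                  // all columns of this split, back to back
  std::vector<std::size_t> column_offsets;  // start of each column within `buffer`
  std::size_t num_rows = 0;                 // rows per column in this split
};

// Deep-copies rows [begin, end) of every column into one contiguous buffer
// per split. Assumes `splits` is non-decreasing and within bounds.
std::vector<packed_partition> contiguous_split_model(
  std::vector<std::vector<int>> const& columns, std::vector<std::size_t> const& splits)
{
  std::size_t const total_rows = columns.empty() ? 0 : columns.front().size();
  std::vector<packed_partition> out;
  out.reserve(splits.size() + 1);

  std::size_t begin = 0;
  for (std::size_t i = 0; i <= splits.size(); ++i) {
    // The last split runs to the end of the input.
    std::size_t const end = (i < splits.size()) ? splits[i] : total_rows;

    packed_partition part;
    part.num_rows = end - begin;
    for (auto const& col : columns) {
      part.column_offsets.push_back(part.buffer.size());
      part.buffer.insert(part.buffer.end(),
                         col.begin() + static_cast<std::ptrdiff_t>(begin),
                         col.begin() + static_cast<std::ptrdiff_t>(end));
    }
    out.push_back(std::move(part));
    begin = end;
  }
  return out;
}
```

With the example input above, the second partition's buffer in this sketch holds {14, 16, 18, 54, 56, 58} with column offsets {0, 3}.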

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are decreasing, that is, when splits[i] > splits[i+1] for some i.

Parameters

input	View of a table to split
splits	A vector of indices where the view will be split
mr	An optional memory resource to use for all returned device allocations

Returns: The set of requested views of input indicated by the splits and the viewed memory buffer

◆ pack()

packed_columns cudf::pack	(	cudf::table_view const &	input,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Deep-copy a table_view into a serialized contiguous memory format.

The metadata from the table_view is copied into a host vector of bytes and the data from the table_view is copied into a device_buffer. Pass the output of this function into cudf::unpack to deserialize.

Parameters

input	View of the table to pack
mr	An optional memory resource to use for all returned device allocations

Returns: A packed_columns struct containing the serialized metadata and data in contiguous host and device memory, respectively
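
As a rough illustration of the pack/unpack idea, the host-only sketch below separates a small metadata byte vector from one contiguous data buffer, and rebuilds non-owning views from the two. Everything in it is an assumption made for illustration: `packed_model`, `column_ref`, `pack_model`, and `unpack_model` are hypothetical names, the "metadata" is just a list of column sizes, and a `std::vector` stands in for the device buffer. It does not reflect cudf's actual serialized format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a packed result (not cudf's packed_columns).
struct packed_model {
  std::vector<std::uint8_t> metadata;  // host-side description of the layout
  std::vector<std::int32_t> data;      // stand-in for the contiguous device buffer
};

// Non-owning view of one column inside packed_model::data.
struct column_ref {
  std::int32_t const* data;
  std::uint64_t size;
};

// Deep-copies every column into one buffer and records each column's size
// as raw bytes in the metadata vector.
packed_model pack_model(std::vector<std::vector<std::int32_t>> const& columns)
{
  packed_model out;
  for (auto const& col : columns) {
    std::uint64_t const size = col.size();
    auto const* bytes = reinterpret_cast<std::uint8_t const*>(&size);
    out.metadata.insert(out.metadata.end(), bytes, bytes + sizeof(size));
    out.data.insert(out.data.end(), col.begin(), col.end());
  }
  return out;
}

// Rebuilds views from the metadata without copying or allocating column data.
// The views are only valid while `packed` is alive.
std::vector<column_ref> unpack_model(packed_model const& packed)
{
  std::vector<column_ref> views;
  std::int32_t const* cursor = packed.data.data();
  for (std::size_t pos = 0; pos + sizeof(std::uint64_t) <= packed.metadata.size();
       pos += sizeof(std::uint64_t)) {
    std::uint64_t size = 0;
    std::memcpy(&size, packed.metadata.data() + pos, sizeof(size));
    views.push_back({cursor, size});
    cursor += size;
  }
  return views;
}
```

The point of the split between metadata and data is that the views returned by the unpack step point straight into the packed buffer, which is why the caller has to keep that buffer alive for as long as the views are used.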

◆ pack_metadata()

std::vector<uint8_t> cudf::pack_metadata	(	table_view const &	table,
		uint8_t const *	contiguous_buffer,
		size_t	buffer_size
	)

Produce the metadata used for packing a table stored in a contiguous buffer.

The metadata from the table_view is copied into a host vector of bytes which can be used to construct a packed_columns or packed_table structure. The caller is responsible for guaranteeing that all of the columns in the table point into contiguous_buffer.

Parameters

table	View of the table to pack
contiguous_buffer	A contiguous buffer of device memory which contains the data referenced by the columns in `table`
buffer_size	The size of `contiguous_buffer`

Returns: Vector of bytes representing the metadata used to unpack a packed_columns struct

◆ split() [1/4]

std::vector<column_view> cudf::split	(	column_view const &	input,
		host_span< size_type const >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits:  {2, 5, 9}
output:  {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}
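
The slicing rule can be written down as a small host-only helper. This is a sketch, not cudf's implementation: `split_ranges` is a hypothetical name, and the validation order and messages are assumptions. It follows the stated expectation that splits[i] <= splits[i+1] <= input.size(), so an index equal to the input size is accepted here.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper (not part of cudf): turns N split indices into the
// N+1 half-open row ranges [begin, end) that the returned views would cover.
std::vector<std::pair<int, int>> split_ranges(int input_size, std::vector<int> const& splits)
{
  std::vector<std::pair<int, int>> ranges;
  ranges.reserve(splits.size() + 1);

  int begin = 0;
  for (int end : splits) {
    if (end < 0 || end > input_size) {
      throw std::out_of_range("split index is outside the input");
    }
    if (end < begin) {
      throw std::invalid_argument("split indices must be non-decreasing");
    }
    ranges.emplace_back(begin, end);
    begin = end;
  }
  // The last range always runs to the end of the input.
  ranges.emplace_back(begin, input_size);
  return ranges;
}
```

For an input of size 10 and splits {2, 5, 9}, this yields the ranges [0, 2), [2, 5), [5, 9), and [9, 10), matching the example above.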

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are decreasing, that is, when splits[i] > splits[i+1] for some i.

Parameters

input	View of column to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ split() [2/4]

std::vector<column_view> cudf::split	(	column_view const &	input,
		std::initializer_list< size_type >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a list of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits:  {2, 5, 9}
output:  {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are decreasing, that is, when splits[i] > splits[i+1] for some i.

Parameters

input	View of column to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ split() [3/4]

std::vector<table_view> cudf::split	(	table_view const &	input,
		host_span< size_type const >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are decreasing, that is, when splits[i] > splits[i+1] for some i.

Parameters

input	View of a table to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ split() [4/4]

std::vector<table_view> cudf::split	(	table_view const &	input,
		std::initializer_list< size_type >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a list of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size N, there will always be N+1 splits in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are decreasing, that is, when splits[i] > splits[i+1] for some i.

Parameters

input	View of a table to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ unpack() [1/2]

table_view cudf::unpack ( packed_columns const & input )

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in input.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters

input	The packed columns to unpack

Returns: The unpacked table_view

◆ unpack() [2/2]

table_view cudf::unpack	(	uint8_t const *	metadata,
		uint8_t const *	gpu_data
	)

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in gpu_data using the metadata contained in the host buffer metadata.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters

metadata	The host-side metadata buffer resulting from the initial pack() call
gpu_data	The device-side contiguous buffer storing the data that will be referenced by the resulting `table_view`

Returns: The unpacked table_view
