Files
file	contiguous_split.hpp
	Table APIs for contiguous_split, pack, unpack, and metadata.

file	copying.hpp
	Column APIs for gather, scatter, split, slice, etc.

Classes
struct	cudf::packed_columns
	Column data in a serialized format.

struct	cudf::packed_table
	The result(s) of a cudf::contiguous_split.

class	cudf::chunked_pack
	Performs a chunked "pack" operation on the input `table_view` using a user-provided buffer of size `user_buffer_size`.

Functions
std::vector< packed_table >	cudf::contiguous_split (cudf::table_view const &input, std::vector< size_type > const &splits, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Performs a deep-copy split of a `table_view` into a vector of `packed_table`, where each `packed_table` uses a single contiguous block of memory for all of the split's column data.

packed_columns	cudf::pack (cudf::table_view const &input, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Deep-copy a `table_view` into a serialized contiguous memory format.

std::vector< uint8_t >	cudf::pack_metadata (table_view const &table, uint8_t const *contiguous_buffer, size_t buffer_size)
	Produce the metadata used for packing a table stored in a contiguous buffer.

table_view	cudf::unpack (packed_columns const &input)
	Deserialize the result of `cudf::pack`.

table_view	cudf::unpack (uint8_t const *metadata, uint8_t const *gpu_data)
	Deserialize the result of `cudf::pack`.

std::vector< column_view >	cudf::split (column_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `column_view` into a set of `column_view`s according to a set of indices derived from expected splits.

std::vector< column_view >	cudf::split (column_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `column_view` into a set of `column_view`s according to a set of indices derived from expected splits.

std::vector< table_view >	cudf::split (table_view const &input, host_span< size_type const > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `table_view` into a set of `table_view`s according to a set of indices derived from expected splits.

std::vector< table_view >	cudf::split (table_view const &input, std::initializer_list< size_type > splits, rmm::cuda_stream_view stream=cudf::get_default_stream())
	Splits a `table_view` into a set of `table_view`s according to a set of indices derived from expected splits.

Detailed Description

Function Documentation

◆ contiguous_split()

std::vector<packed_table> cudf::contiguous_split	(	cudf::table_view const &	input,
		std::vector< size_type > const &	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Performs a deep-copy split of a table_view into a vector of packed_table, where each packed_table uses a single contiguous block of memory for all of the split's column data.

The memory for the output views is allocated in a single contiguous rmm::device_buffer returned in the packed_table. There is no top-level owning table.

The returned views of input are constructed from a vector of indices that indicate where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i, it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits vector of size N, the output always contains N+1 tables.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory contained in the all_data field of the returned packed_table.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are strictly decreasing.

Parameters

input	View of a table to split
splits	A vector of indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches
mr	An optional memory resource to use for all returned device allocations

Returns: The set of requested views of input indicated by the splits and the viewed memory buffer
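
The slicing rule described above can be sketched on the host as follows. This is a minimal illustration of how split points map to half-open row ranges, not cudf's implementation: `split_ranges` is a hypothetical helper, and it assumes the split points are already valid (non-decreasing and no larger than the input size).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not part of cudf): turns split points into
// half-open row ranges [begin, end) over an input of `size` rows.
// Assumes `splits` is non-decreasing and every value is <= size.
std::vector<std::pair<int, int>> split_ranges(int size, std::vector<int> const& splits)
{
  std::vector<std::pair<int, int>> ranges;
  ranges.reserve(splits.size() + 1);
  int begin = 0;
  for (int s : splits) {
    ranges.emplace_back(begin, s);  // [previous split point, this split point)
    begin = s;
  }
  ranges.emplace_back(begin, size);  // the last range runs to the end of the input
  assert(ranges.size() == splits.size() + 1);  // N split points give N+1 ranges
  return ranges;
}
```

With the example input of 10 rows, `split_ranges(10, {2, 5, 9})` yields the ranges [0, 2), [2, 5), [5, 9), and [9, 10), one per output table.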

◆ pack()

packed_columns cudf::pack	(	cudf::table_view const &	input,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Deep-copy a table_view into a serialized contiguous memory format.

The metadata from the table_view is copied into a host vector of bytes and the data from the table_view is copied into a device_buffer. Pass the output of this function into cudf::unpack to deserialize.

Parameters

input	View of the table to pack
stream	CUDA stream used for device memory operations and kernel launches
mr	An optional memory resource to use for all returned device allocations

Returns: A packed_columns struct containing the serialized metadata and data in contiguous host and device memory, respectively
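
The idea of separating metadata from a single contiguous data buffer can be illustrated with a host-only model. Everything in this sketch is an assumption made for illustration: `PackedModel` and `pack_model` are hypothetical names, the columns are plain `int32` vectors, the data buffer lives in host memory rather than in a device_buffer, and the metadata layout (a column count followed by offset and row-count pairs) is invented here and is not cudf's serialized format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-only stand-in for a packed table.
struct PackedModel {
  std::vector<std::uint8_t> metadata;  // column count, then {byte offset, row count} pairs
  std::vector<std::uint8_t> data;      // all column values, stored back to back
};

// Deep-copies every column into one contiguous buffer and records where
// each column starts, so the columns can be located again later.
PackedModel pack_model(std::vector<std::vector<std::int32_t>> const& columns)
{
  PackedModel out;
  std::vector<std::uint64_t> header;
  header.push_back(columns.size());
  for (auto const& col : columns) {
    std::size_t const offset = out.data.size();
    std::size_t const bytes  = col.size() * sizeof(std::int32_t);
    header.push_back(offset);
    header.push_back(col.size());
    out.data.resize(offset + bytes);
    if (bytes != 0) { std::memcpy(out.data.data() + offset, col.data(), bytes); }
  }
  // Serialize the header into raw bytes, mirroring a "host vector of bytes".
  out.metadata.resize(header.size() * sizeof(std::uint64_t));
  std::memcpy(out.metadata.data(), header.data(), out.metadata.size());
  assert(out.metadata.size() == (1 + 2 * columns.size()) * sizeof(std::uint64_t));
  return out;
}
```

In this model the returned struct owns both buffers, so it can be moved or stored as a unit, and the column layout can be recovered from `metadata` alone.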

◆ pack_metadata()

std::vector<uint8_t> cudf::pack_metadata	(	table_view const &	table,
		uint8_t const *	contiguous_buffer,
		size_t	buffer_size
	)

Produce the metadata used for packing a table stored in a contiguous buffer.

The metadata from the table_view is copied into a host vector of bytes which can be used to construct a packed_columns or packed_table structure. The caller is responsible for guaranteeing that all of the columns in the table point into contiguous_buffer.

Parameters

table	View of the table to pack
contiguous_buffer	A contiguous buffer of device memory which contains the data referenced by the columns in `table`
buffer_size	The size of `contiguous_buffer`

Returns: Vector of bytes representing the metadata used to unpack a packed_columns struct
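
The caller-side guarantee described above (every column points into `contiguous_buffer`) amounts to a range-containment check. The sketch below shows one way such a check could look on raw pointers. `points_into` is a hypothetical helper and not a cudf function, and it only compares addresses; it does not inspect a real column_view.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical check (not a cudf function): returns true when the byte range
// [ptr, ptr + bytes) lies entirely inside [buffer, buffer + buffer_size).
bool points_into(std::uint8_t const* buffer, std::size_t buffer_size,
                 std::uint8_t const* ptr, std::size_t bytes)
{
  assert(buffer != nullptr || buffer_size == 0);
  // Compare integer addresses so that unrelated pointers are handled safely.
  auto const base  = reinterpret_cast<std::uintptr_t>(buffer);
  auto const first = reinterpret_cast<std::uintptr_t>(ptr);
  if (first < base) { return false; }
  std::uintptr_t const start = first - base;
  // Written to avoid overflow: start + bytes <= buffer_size.
  return start <= buffer_size && bytes <= buffer_size - start;
}
```

A caller could run a check of this kind over each column's data pointer before producing metadata for a hand-assembled buffer.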

◆ split() [1/4]

std::vector<column_view> cudf::split	(	column_view const &	input,
		host_span< size_type const >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i, it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size of N, there will always be N+1 views in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits:  {2, 5, 9}
output:  {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are strictly decreasing.

Parameters

input	View of column to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits
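
The documented error conditions can be mirrored by a small host-side validator. This is only a sketch: `validate_splits` is a hypothetical function, the order in which the checks run is an assumption, and a split point equal to the input size is treated as valid because the constraint splits[i] <= input.size() allows it.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical validator (not cudf code) that mirrors the documented errors.
void validate_splits(int input_size, std::vector<int> const& splits)
{
  assert(input_size >= 0);
  int previous = 0;
  for (int s : splits) {
    // A split point must lie within the input, including its end position.
    if (s < 0 || s > input_size) {
      throw std::out_of_range("split index is outside [0, input_size]");
    }
    // Split points may repeat but must never go backwards.
    if (s < previous) {
      throw std::invalid_argument("split indices must be non-decreasing");
    }
    previous = s;
  }
}
```

Under these assumptions `{2, 5, 9}` and `{0, 10}` pass for a 10-row input, `{2, 11}` raises `std::out_of_range`, and `{5, 2}` raises `std::invalid_argument`.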

◆ split() [2/4]

std::vector<column_view> cudf::split	(	column_view const &	input,
		std::initializer_list< size_type >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a column_view into a set of column_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned column_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i, it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size of N, there will always be N+1 views in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   {10, 12, 14, 16, 18, 20, 22, 24, 26, 28}
splits:  {2, 5, 9}
output:  {{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}}

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are strictly decreasing.

Parameters

input	View of column to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ split() [3/4]

std::vector<table_view> cudf::split	(	table_view const &	input,
		host_span< size_type const >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i, it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size of N, there will always be N+1 views in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are strictly decreasing.

Parameters

input	View of a table to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits
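
Splitting a table applies the same row boundaries to every column and produces views rather than copies. The following host-only sketch models that behavior with raw pointers. `IntView`, `TableViewModel`, and `split_table` are hypothetical stand-ins for column_view, table_view, and the table overload of split; the sketch assumes `int` columns of equal length and valid, non-decreasing split points.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning view over a run of ints (a stand-in for column_view).
struct IntView {
  int const* data;
  std::size_t size;
};

// A "table view" in this model is one IntView per column, all with the same row count.
using TableViewModel = std::vector<IntView>;

// Splits every column at the same row indices, so part k of each column
// covers the same rows. No element is copied; only pointers and sizes change.
std::vector<TableViewModel> split_table(TableViewModel const& input,
                                        std::vector<std::size_t> const& splits)
{
  std::size_t const rows = input.empty() ? 0 : input.front().size;
  std::vector<TableViewModel> parts(splits.size() + 1);
  std::size_t begin = 0;
  for (std::size_t k = 0; k <= splits.size(); ++k) {
    std::size_t const end = (k < splits.size()) ? splits[k] : rows;
    assert(begin <= end && end <= rows);  // split points must be valid
    for (IntView const& col : input) {
      parts[k].push_back(IntView{col.data + begin, end - begin});
    }
    begin = end;
  }
  return parts;
}
```

Because each part only points into the original storage, the parts remain usable only while the vectors or buffers behind `input` stay alive, which is the lifetime caveat stated in the note above.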

◆ split() [4/4]

std::vector<table_view> cudf::split	(	table_view const &	input,
		std::initializer_list< size_type >	splits,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`
	)

Splits a table_view into a set of table_views according to a set of indices derived from expected splits.

The returned views of input are constructed from a vector of splits, which indicates where each split should occur. The ith returned table_view is sliced as [0, splits[i]) if i=0, as [splits[i-1], input.size()) if i is the last view, and as [splits[i-1], splits[i]) otherwise.

For all i, it is expected that splits[i] <= splits[i+1] <= input.size(). For a splits size of N, there will always be N+1 views in the output.

Note: It is the caller's responsibility to ensure that the returned views do not outlive the viewed device memory.

Example:
input:   [{10, 12, 14, 16, 18, 20, 22, 24, 26, 28},
          {50, 52, 54, 56, 58, 60, 62, 64, 66, 68}]
splits:  {2, 5, 9}
output:  [{{10, 12}, {14, 16, 18}, {20, 22, 24, 26}, {28}},
          {{50, 52}, {54, 56, 58}, {60, 62, 64, 66}, {68}}]

Exceptions

std::out_of_range	If `splits` has an end index greater than the size of `input`.
std::out_of_range	When a value in `splits` is not in the range [0, input.size()].
std::invalid_argument	When the values in `splits` are strictly decreasing.

Parameters

input	View of a table to split
splits	Indices where the view will be split
stream	CUDA stream used for device memory operations and kernel launches

Returns: The set of requested views of input indicated by the splits

◆ unpack() [1/2]

table_view cudf::unpack	(	packed_columns const &	input	)

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in input.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters

input	The packed columns to unpack

Returns: The unpacked table_view

◆ unpack() [2/2]

table_view cudf::unpack	(	uint8_t const *	metadata,
		uint8_t const *	gpu_data
	)

Deserialize the result of cudf::pack.

Converts the result of a serialized table into a table_view that points to the data stored in the contiguous device buffer contained in gpu_data using the metadata contained in the host buffer metadata.

It is the caller's responsibility to ensure that the table_view in the output does not outlive the data in the input.

No new device memory is allocated in this function.

Parameters

metadata	The host-side metadata buffer resulting from the initial pack() call
gpu_data	The device-side contiguous buffer storing the data that will be referenced by the resulting `table_view`

Returns: The unpacked table_view
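
The pointer-based overload can be pictured as reading a small header and then building views that point into the caller's data buffer. The sketch below is a host-only model under stated assumptions: `ColumnViewModel` and `unpack_model` are hypothetical, the metadata layout (a `uint64` column count followed by byte-offset and row-count pairs) is invented for this example and is not cudf's format, and the data buffer is assumed to hold properly aligned `int32` values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical non-owning view of one int32 column.
struct ColumnViewModel {
  std::int32_t const* data;
  std::size_t rows;
};

// Assumed metadata layout for this sketch only: a uint64 column count
// followed by one {byte offset, row count} uint64 pair per column.
std::vector<ColumnViewModel> unpack_model(std::uint8_t const* metadata,
                                          std::uint8_t const* data)
{
  std::uint64_t num_columns = 0;
  std::memcpy(&num_columns, metadata, sizeof(num_columns));

  std::vector<ColumnViewModel> views;
  views.reserve(num_columns);
  for (std::uint64_t i = 0; i < num_columns; ++i) {
    std::uint64_t record[2];  // {byte offset, row count}
    std::memcpy(record, metadata + sizeof(num_columns) + i * sizeof(record), sizeof(record));
    // Point into the caller's buffer; no column data is copied or allocated.
    auto const* first = reinterpret_cast<std::int32_t const*>(data + record[0]);
    views.push_back(ColumnViewModel{first, static_cast<std::size_t>(record[1])});
  }
  assert(views.size() == num_columns);
  return views;
}
```

As in the function documented above, the views produced by this model are valid only while the buffer behind `data` is alive, since they hold plain pointers into it.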
