A unit of table data in a streaming pipeline.
#include <table_chunk.hpp>
Public Types | |
| enum class | ExclusiveView : bool { NO , YES } |
| Indicates whether the TableChunk holds an exclusive or shared view of the underlying table data. | |
Public Member Functions | |
| TableChunk (std::unique_ptr< cudf::table > table, rmm::cuda_stream_view stream) | |
| Construct a TableChunk from a device table. | |
| TableChunk (cudf::table_view table_view, std::size_t device_alloc_size, rmm::cuda_stream_view stream, OwningWrapper &&owner, ExclusiveView exclusive_view) | |
| Construct a TableChunk from a device table view. | |
| TableChunk (std::unique_ptr< cudf::packed_columns > packed_columns, rmm::cuda_stream_view stream) | |
| Construct a TableChunk from packed columns. | |
| TableChunk (std::unique_ptr< PackedData > packed_data) | |
| Construct a TableChunk from a packed data blob. | |
| TableChunk (TableChunk &&)=default | |
| TableChunk is moveable. | |
| TableChunk & | operator= (TableChunk &&)=default |
| Move assignment. | |
| TableChunk (TableChunk const &)=delete | |
| TableChunk & | operator= (TableChunk const &)=delete |
| rmm::cuda_stream_view | stream () const noexcept |
| Returns the CUDA stream on which this table chunk was created. | |
| std::size_t | data_alloc_size (MemoryType mem_type) const |
| Number of bytes allocated for the data in the specified memory type. | |
| bool | is_available () const noexcept |
| Indicates whether the underlying cudf table data is fully available in device memory. | |
| std::size_t | make_available_cost () const noexcept |
| Returns the estimated cost (in bytes) of making the table available. | |
| TableChunk | make_available (MemoryReservation &reservation) |
| Moves this table chunk into a new one with its cudf table made available. | |
| cudf::table_view | table_view () const |
| Returns a view of the underlying table. | |
| bool | is_spillable () const |
| Indicates whether this table chunk can be spilled from device memory. | |
| TableChunk | copy (MemoryReservation &reservation) const |
| Create a deep copy of the table chunk. | |
A unit of table data in a streaming pipeline.
Represents either an unpacked cudf::table, a cudf::packed_columns, or a PackedData.
TableChunks may be initially unavailable (e.g., if the data is packed or spilled), and can be made available (i.e., materialized to device memory) on demand.
Definition at line 36 of file table_chunk.hpp.
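The three representations and the availability rule can be pictured with a small standalone model. This is only a sketch: `DeviceTable`, `PackedColumns`, `PackedBlob`, and `ReprChunk` are hypothetical stand-ins, not rapidsmpf or cudf types, and the rule that only the unpacked form counts as available is an assumption taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for cudf::table, cudf::packed_columns and PackedData.
struct DeviceTable   { std::vector<int> rows; };
struct PackedColumns { std::vector<std::byte> bytes; };
struct PackedBlob    { std::vector<std::byte> bytes; };

// Toy chunk that holds exactly one of the three representations.
class ReprChunk {
  public:
    explicit ReprChunk(DeviceTable t) : data_{std::move(t)} {}
    explicit ReprChunk(PackedColumns p) : data_{std::move(p)} {}
    explicit ReprChunk(PackedBlob p) : data_{std::move(p)} {}

    // Assumption: only the unpacked representation counts as available.
    bool is_available() const noexcept {
        return std::holds_alternative<DeviceTable>(data_);
    }

  private:
    std::variant<DeviceTable, PackedColumns, PackedBlob> data_;
};
```

In this model a chunk built from either packed form reports `is_available() == false` until it is converted to the unpacked form.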
| enum class rapidsmpf::streaming::TableChunk::ExclusiveView : bool | strong |
Indicates whether the TableChunk holds an exclusive or shared view of the underlying table data.
This boolean enum is used to explicitly express ownership semantics when constructing a TableChunk from a cudf::table_view.
ExclusiveView::YES: The TableChunk has exclusive ownership of the table's device memory and is considered spillable.
ExclusiveView::NO: The TableChunk is a non-owning view of data managed elsewhere. The memory may be shared or externally owned, and the chunk is therefore not spillable.
Definition at line 52 of file table_chunk.hpp.
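A minimal sketch of how a bool-backed scoped enum expresses this choice. The enum mirrors the declaration listed above; `view_is_spillable` is a hypothetical helper written for illustration and is not part of rapidsmpf.

```cpp
#include <cassert>

// Mirrors the documented declaration: a scoped enum with bool as underlying type.
enum class ExclusiveView : bool { NO, YES };

// Hypothetical helper (not part of rapidsmpf): applies the spillability rule
// described above to a chunk built from a table view.
constexpr bool view_is_spillable(ExclusiveView v) noexcept {
    return v == ExclusiveView::YES;
}

static_assert(view_is_spillable(ExclusiveView::YES), "exclusive views are spillable");
static_assert(!view_is_spillable(ExclusiveView::NO), "shared views are not spillable");
```

Compared with a bare `bool` parameter, the scoped enum makes call sites self-describing: `ExclusiveView::YES` reads unambiguously where `true` would not.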
| rapidsmpf::streaming::TableChunk::TableChunk | ( | std::unique_ptr< cudf::table > | table, |
| rmm::cuda_stream_view | stream | ||
| ) |
Construct a TableChunk from a device table.
| table | Device-resident table. |
| stream | The CUDA stream on which the table was created. |
| rapidsmpf::streaming::TableChunk::TableChunk | ( | cudf::table_view | table_view, |
| std::size_t | device_alloc_size, | ||
| rmm::cuda_stream_view | stream, | ||
| OwningWrapper && | owner, | ||
| ExclusiveView | exclusive_view | ||
| ) |
Construct a TableChunk from a device table view.
The TableChunk does not take ownership of the underlying data; instead, the provided owner object is kept alive for the lifetime of the TableChunk. The caller is responsible for ensuring that the underlying device memory referenced by table_view remains valid during this period.
This constructor is typically used when creating a TableChunk from Python, where owner is used to keep the corresponding Python object alive until the TableChunk is destroyed.
| table_view | Device-resident table view. |
| device_alloc_size | Number of bytes allocated in device memory. |
| stream | CUDA stream on which the table was created. |
| owner | Object owning the memory backing table_view. This object will be destroyed last when the TableChunk is destroyed or spilled. |
| exclusive_view | Specifies whether this TableChunk has exclusive ownership semantics over the underlying table data; see ExclusiveView. |
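The keep-alive behaviour of `owner` can be illustrated with a standalone model. This is a sketch under stated assumptions: `ToyView` and `ToyViewChunk` are hypothetical, and `std::shared_ptr<void>` merely stands in for a type-erased owner; it is not the real `OwningWrapper`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning table view: pointer plus length.
struct ToyView {
    int const* data;
    std::size_t size;
};

// Toy chunk that keeps a type-erased owner alive for as long as it lives.
// std::shared_ptr<void> is used for illustration only.
class ToyViewChunk {
  public:
    ToyViewChunk(ToyView view, std::shared_ptr<void> owner)
        : view_{view}, owner_{std::move(owner)} {}

    ToyView view() const noexcept { return view_; }

  private:
    ToyView view_;
    std::shared_ptr<void> owner_;  // released when the chunk is destroyed
};

// Returns true if the buffer stays alive while the chunk exists and is
// released once the chunk is destroyed.
inline bool owner_released_with_chunk() {
    auto buffer = std::make_shared<std::vector<int>>(std::vector<int>{7, 8, 9});
    std::weak_ptr<std::vector<int>> watcher = buffer;
    {
        ToyViewChunk chunk{ToyView{buffer->data(), buffer->size()}, buffer};
        buffer.reset();                               // caller drops its reference
        if (watcher.expired()) return false;          // chunk must still hold the owner
        if (chunk.view().data[0] != 7) return false;  // view is still readable
    }                                                 // chunk destroyed here
    return watcher.expired();                         // owner released with the chunk
}
```

The same idea applies when the owner is a handle to a Python object: the chunk holds the handle, so the memory behind the view cannot be freed while the chunk exists.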
| rapidsmpf::streaming::TableChunk::TableChunk | ( | std::unique_ptr< cudf::packed_columns > | packed_columns, |
| rmm::cuda_stream_view | stream | ||
| ) |
Construct a TableChunk from packed columns.
| packed_columns | Serialized device table. |
| stream | The CUDA stream on which the packed_columns was created. |
| rapidsmpf::streaming::TableChunk::TableChunk | ( | std::unique_ptr< PackedData > | packed_data | ) |
Construct a TableChunk from a packed data blob.
The packed data's CUDA stream will be associated with the new table chunk.
| packed_data | Serialized host/device data with metadata. |
| TableChunk rapidsmpf::streaming::TableChunk::copy | ( | MemoryReservation & | reservation | ) | const |
Create a deep copy of the table chunk.
Allocates new memory for all buffers in the table using the specified reservation, which determines the target memory type (e.g., host or device). As a consequence, the is_available() status may differ in the new copy. For example, copying an available table chunk from device to host memory will result in an unavailable copy.
| reservation | Memory reservation used to track and limit allocations. |
TableChunk instance containing copies of all buffers and metadata.
| std::overflow_error | If the total allocation size exceeds the available reservation. |
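The interaction between the reservation, the target memory type, and availability can be sketched with a toy model. Everything here is hypothetical (`ToyMemoryType`, `ToyReservation`, `CopyChunk`) and only mimics the documented behaviour; the real `MemoryReservation` interface is not shown in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

enum class ToyMemoryType { DEVICE, HOST };  // hypothetical

// Hypothetical reservation: a byte budget tied to one memory type.
class ToyReservation {
  public:
    ToyReservation(ToyMemoryType type, std::size_t bytes) : type_{type}, left_{bytes} {}

    ToyMemoryType mem_type() const noexcept { return type_; }
    std::size_t remaining() const noexcept { return left_; }

    // Deducts bytes from the budget, or throws if the budget is too small.
    void consume(std::size_t bytes) {
        if (bytes > left_) throw std::overflow_error("reservation exceeded");
        left_ -= bytes;
    }

  private:
    ToyMemoryType type_;
    std::size_t left_;
};

// Toy chunk: "available" simply means the data lives in device memory.
class CopyChunk {
  public:
    CopyChunk(std::vector<int> data, ToyMemoryType where)
        : data_{std::move(data)}, where_{where} {}

    bool is_available() const noexcept { return where_ == ToyMemoryType::DEVICE; }
    std::size_t alloc_size() const noexcept { return data_.size() * sizeof(int); }

    // Deep copy into the memory type selected by the reservation.
    CopyChunk copy(ToyReservation& reservation) const {
        reservation.consume(alloc_size());  // throws before anything is copied
        return CopyChunk{data_, reservation.mem_type()};
    }

  private:
    std::vector<int> data_;
    ToyMemoryType where_;
};
```

Copying a device-resident `CopyChunk` with a host reservation yields a copy whose `is_available()` is false, while the original stays available until it is destroyed.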
| std::size_t rapidsmpf::streaming::TableChunk::data_alloc_size | ( | MemoryType | mem_type | ) | const |
Number of bytes allocated for the data in the specified memory type.
| mem_type | The memory type to query. |
| bool rapidsmpf::streaming::TableChunk::is_available | ( | ) | const | noexcept |
Indicates whether the underlying cudf table data is fully available in device memory.
true if the table is already available; otherwise, false.
| bool rapidsmpf::streaming::TableChunk::is_spillable | ( | ) | const |
Indicates whether this table chunk can be spilled from device memory.
A table chunk is considered spillable if it owns its underlying memory. This is true when it was created from one of the following:
A cudf::table, cudf::packed_columns, or PackedData.
A cudf::table_view constructed with exclusive_view == ExclusiveView::YES, indicating that the view is the sole representation of the underlying data and that its owner exclusively manages the table's memory.
In contrast, chunks constructed from non-exclusive cudf::table_view instances are non-owning views of externally managed memory and are therefore not spillable.
To spill a table chunk from device to host memory, first call copy() to create a host-side copy, then delete or overwrite the original device chunk. If is_spillable() == true, destroying the original device chunk will release the associated device memory.
true if the table chunk owns its memory and can be spilled; otherwise false.
| TableChunk rapidsmpf::streaming::TableChunk::make_available | ( | MemoryReservation & | reservation | ) |
Moves this table chunk into a new one with its cudf table made available.
As part of the move, a copy or unpack may be performed; any such operation uses the associated CUDA stream.
| reservation | Memory reservation for allocations if needed. |
| std::size_t rapidsmpf::streaming::TableChunk::make_available_cost | ( | ) | const | noexcept |
Returns the estimated cost (in bytes) of making the table available.
Currently, only device memory cost is tracked.
| TableChunk & rapidsmpf::streaming::TableChunk::operator= | ( | TableChunk && | ) | default |
Move assignment.
| rmm::cuda_stream_view rapidsmpf::streaming::TableChunk::stream | ( | ) | const | noexcept |
Returns the CUDA stream on which this table chunk was created.
| cudf::table_view rapidsmpf::streaming::TableChunk::table_view | ( | ) | const |
Returns a view of the underlying table.
The table must be available in device memory.
| std::invalid_argument | if is_available() == false. |
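A common calling pattern is to check availability before asking for the view. The sketch below models that pattern with a hypothetical `GuardedChunk`; it is not the rapidsmpf class. In particular, the toy `make_available()` takes no reservation, and a zero cost for an already available chunk is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical toy chunk: "available" means the rows can be read directly.
class GuardedChunk {
  public:
    GuardedChunk(std::vector<int> rows, bool available)
        : rows_{std::move(rows)}, available_{available} {}

    bool is_available() const noexcept { return available_; }

    // Bytes needed to make the chunk available (assumed 0 if it already is).
    std::size_t make_available_cost() const noexcept {
        return available_ ? 0 : rows_.size() * sizeof(int);
    }

    // Consumes this chunk and returns an available one (rvalue-only).
    GuardedChunk make_available() && {
        return GuardedChunk{std::move(rows_), true};
    }

    // Mirrors the documented contract: throws if the chunk is unavailable.
    std::vector<int> const& table_view() const {
        if (!available_) throw std::invalid_argument("chunk is not available");
        return rows_;
    }

  private:
    std::vector<int> rows_;
    bool available_;
};

// Reads the first row, making the chunk available first when required.
inline int first_row(GuardedChunk chunk) {
    if (!chunk.is_available()) {
        chunk = std::move(chunk).make_available();
    }
    return chunk.table_view().front();
}
```

Querying `make_available_cost()` before the call lets the caller size a reservation up front instead of discovering a shortfall when the data is materialized.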