rapidsmpf Namespace Reference

RAPIDS Multi-Processor interfaces. More...

Namespaces

 bootstrap
 
 coll
 Collective communication interfaces.
 
 communicator
 
 config
 
 detail
 
 mpi
 Collection of helpful MPI functions.
 
 shuffler
 Shuffler interfaces.
 
 streaming
 

Classes

class  Tag
 A tag used for identifying messages in a communication operation. More...
 
class  Communicator
 Abstract base class for a communication mechanism between nodes. More...
 
class  MPI
 MPI communicator class that implements the Communicator interface. More...
 
class  Single
 Single process communicator class that implements the Communicator interface. More...
 
class  CudaEvent
 RAII wrapper for a CUDA event with convenience methods. More...
 
struct  cuda_error
 Exception thrown when a CUDA error is encountered. More...
 
class  bad_alloc
 Exception thrown when a RapidsMPF allocation fails. More...
 
class  out_of_memory
 Exception thrown when RapidsMPF runs out of memory. More...
 
class  reservation_error
 Exception thrown when a memory reservation fails in RapidsMPF. More...
 
struct  BloomFilter
 A bloom filter, used for approximate set membership queries. More...
 
class  Buffer
 Buffer representing device or host memory. More...
 
class  BufferResource
 Class managing buffer resources. More...
 
class  LimitAvailableMemory
 A functor for querying the remaining available memory within a defined limit from an RMM statistics resource. More...
 
class  ContentDescription
 Description of an object's content. More...
 
class  HostBuffer
 Block of host memory. More...
 
class  HostMemoryResource
 Host memory resource using standard CPU allocation. More...
 
class  MemoryReservation
 Represents a reservation for future memory allocation. More...
 
struct  PackedData
 Bag of bytes with metadata suitable for sending over the wire. More...
 
struct  PinnedPoolProperties
 Properties for configuring a pinned memory pool. More...
 
class  PinnedMemoryResource
 Memory resource that provides pinned (page-locked) host memory using a pool. More...
 
struct  ScopedMemoryRecord
 Memory statistics for a specific scope. More...
 
class  SpillManager
 Manages memory spilling to free up device memory when needed. More...
 
class  OwningWrapper
 Utility class to store an arbitrary type-erased object while another object is alive. More...
 
class  ProgressThread
 A progress thread that can execute arbitrary functions. More...
 
class  RmmResourceAdaptor
 An RMM memory resource adaptor tailored to RapidsMPF. More...
 
class  Statistics
 Tracks statistics across rapidsmpf operations. More...
 
class  StreamOrderedTiming
 Stream-ordered wall-clock timer that records its result into Statistics. More...
 
struct  overloaded
 Helper for overloaded lambdas using std::visit. More...
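
The overloaded helper is only summarized here, so the following is a minimal sketch of the widely used idiom it names, not the library's own definition. The sketch namespace and the describe function are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Sketch of the common "overloaded" idiom; rapidsmpf::overloaded may differ in detail.
namespace sketch {

template <typename... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;  // expose the call operator of every lambda
};
// Deduction guide so that overloaded{lambda1, lambda2} works in C++17.
template <typename... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

// Hypothetical helper: describe whichever alternative the variant currently holds.
inline std::string describe(std::variant<int, std::string> const& v) {
    return std::visit(
        overloaded{
            [](int i) { return "int:" + std::to_string(i); },
            [](std::string const& s) { return "str:" + s; },
        },
        v
    );
}

}  // namespace sketch
```

std::visit picks the lambda whose parameter type matches the active alternative, so each case is handled without an explicit type switch.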
 

Typedefs

using Rank = std::int32_t
 The rank of a node (e.g. the rank of an MPI process), or world size (total number of ranks). More...
 
using OpID = std::int32_t
 Operation ID defined by the user. This allows users to concurrently execute multiple operations, and each operation will be identified by its OpID. More...
 
using StageID = std::int32_t
 Identifier for a stage of a communication operation. More...
 
using Clock = std::chrono::high_resolution_clock
 Alias for high-resolution clock from the chrono library.
 
using Duration = std::chrono::duration< double >
 Alias for a duration type representing time in seconds as a double.
 
using TimePoint = std::chrono::time_point< Clock, Duration >
 Alias for a time point with double precision in seconds.
 

Enumerations

enum class  AllowOverbooking : bool { NO , YES }
 Policy controlling whether a memory reservation is allowed to overbook. More...
 
enum class  MemoryType : int { DEVICE = 0 , PINNED_HOST = 1 , HOST = 2 }
 Enum representing the type of memory sorted in decreasing order of preference. More...
 
enum class  TrimZeroFraction { NO , YES }
 Control whether a zero fractional part is omitted when formatting values. More...
 

Functions

std::ostream & operator<< (std::ostream &os, Communicator const &obj)
 Overloads the stream insertion operator for the Communicator class. More...
 
template<typename Range1 , typename Range2 >
void cuda_stream_join (Range1 const &downstreams, Range2 const &upstreams, CudaEvent *event=nullptr)
 Make downstream CUDA streams wait on upstream CUDA streams. More...
 
void cuda_stream_join (rmm::cuda_stream_view downstream, rmm::cuda_stream_view upstream, CudaEvent *event=nullptr)
 Make a downstream CUDA stream wait on an upstream CUDA stream. More...
 
std::pair< std::vector< cudf::table_view >, std::unique_ptr< cudf::table > > partition_and_split (cudf::table_view const &table, std::vector< cudf::size_type > const &columns_to_hash, int num_partitions, cudf::hash_id hash_function, std::uint32_t seed, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
 Partitions rows from the input table into multiple output tables. More...
 
std::unordered_map< shuffler::PartID, PackedData > partition_and_pack (cudf::table_view const &table, std::vector< cudf::size_type > const &columns_to_hash, int num_partitions, cudf::hash_id hash_function, std::uint32_t seed, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
 Partitions rows from the input table into multiple packed (serialized) tables. More...
 
std::unordered_map< shuffler::PartID, PackedData > split_and_pack (cudf::table_view const &table, std::vector< cudf::size_type > const &splits, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
 Splits rows from the input table into multiple packed (serialized) tables. More...
 
std::unique_ptr< cudf::table > unpack_and_concat (std::vector< PackedData > &&partitions, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
 Unpack (deserialize) input partitions and concatenate them into a single table. More...
 
std::vector< PackedData > spill_partitions (std::vector< PackedData > &&partitions, BufferResource *br)
 Spill partitions from device memory to host memory. More...
 
std::vector< PackedData > unspill_partitions (std::vector< PackedData > &&partitions, BufferResource *br, AllowOverbooking allow_overbooking)
 Move spilled partitions (i.e., packed tables in host memory) back to device memory. More...
 
std::string str (cudf::column_view col, cudf::size_type index, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Converts the element at a specific index in a cudf::column_view to a string. More...
 
std::string str (cudf::column_view col, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Converts all elements in a cudf::column_view to a string. More...
 
std::string str (cudf::table_view tbl, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Converts all rows in a cudf::table_view to a string. More...
 
std::size_t estimated_memory_usage (cudf::column_view const &col, rmm::cuda_stream_view stream)
 Estimate the memory usage of a column. More...
 
std::size_t estimated_memory_usage (cudf::table_view const &tbl, rmm::cuda_stream_view stream)
 Estimate the memory usage of a table. More...
 
void buffer_copy (std::shared_ptr< Statistics > statistics, Buffer &dst, Buffer const &src, std::size_t size, std::ptrdiff_t dst_offset=0, std::ptrdiff_t src_offset=0)
 Asynchronously copy data between buffers. More...
 
std::unordered_map< MemoryType, BufferResource::MemoryAvailable > memory_available_from_options (RmmResourceAdaptor *mr, config::Options options)
 Construct a map of memory-available functions from configuration options. More...
 
std::optional< Duration > periodic_spill_check_from_options (config::Options options)
 Get the periodic_spill_check parameter from configuration options. More...
 
std::shared_ptr< rmm::cuda_stream_pool > stream_pool_from_options (config::Options options)
 Get a new CUDA stream pool from configuration options. More...
 
constexpr std::span< MemoryType const > leq_memory_types (MemoryType mem_type) noexcept
 Get the memory types with preference lower than or equal to mem_type. More...
 
constexpr char const * to_string (MemoryType mem_type)
 Get the name of a MemoryType. More...
 
std::ostream & operator<< (std::ostream &os, MemoryType mem_type)
 Overload to write type name to the output stream. More...
 
std::istream & operator>> (std::istream &is, MemoryType &out)
 Overload to read a MemoryType value from an input stream. More...
 
bool is_pinned_memory_resources_supported ()
 Checks if the PinnedMemoryResource is supported for the current CUDA version. More...
 
std::uint64_t get_total_host_memory () noexcept
 Get the total amount of system memory. More...
 
int get_current_numa_node () noexcept
 Get the NUMA node ID associated with the calling CPU thread. More...
 
std::vector< int > get_current_numa_nodes () noexcept
 Get current NUMA node(s) for memory binding. More...
 
std::uint64_t get_numa_node_host_memory (int numa_id=get_current_numa_node()) noexcept
 Get the total amount of host memory for a NUMA node. More...
 
template<typename MapType >
std::pair< typename MapType::key_type, typename MapType::mapped_type > extract_item (MapType &map, typename MapType::const_iterator position)
 Extracts a key-value pair from a map, removing it from the map. More...
 
template<typename MapType >
std::pair< typename MapType::key_type, typename MapType::mapped_type > extract_item (MapType &map, typename MapType::key_type const &key)
 Extracts a key-value pair from a map, removing it from the map. More...
 
template<typename MapType >
MapType::mapped_type extract_value (MapType &map, typename MapType::key_type const &key)
 Extracts the value associated with a specific key from a map, removing the key-value pair. More...
 
template<typename MapType >
MapType::mapped_type extract_value (MapType &map, typename MapType::const_iterator position)
 Extracts the value associated with a specific key from a map, removing the key-value pair. More...
 
template<typename MapType >
MapType::key_type extract_key (MapType &map, typename MapType::key_type const &key)
 Extracts a key from a map, removing the key-value pair. More...
 
template<typename MapType >
MapType::key_type extract_key (MapType &map, typename MapType::const_iterator position)
 Extracts a key from a map, removing the key-value pair. More...
 
template<typename MapType >
auto to_vector (MapType &&map)
 Converts a map-like associative container to a vector by moving the values and discarding the keys. More...
 
bool is_running_under_valgrind ()
 Checks whether the application is running under Valgrind. More...
 
template<typename T >
constexpr T safe_div (T x, T y)
 Performs safe division, returning 0 if the denominator is zero. More...
 
template<std::ranges::input_range R, typename T , typename Proj = std::identity>
constexpr bool contains (R &&range, T const &value, Proj proj={})
 Backport of std::ranges::contains from C++23 for C++20. More...
 
template<typename To , typename From >
requires std::is_arithmetic_v< To > && std::is_arithmetic_v< From >
constexpr To safe_cast (From value, std::source_location const &loc=std::source_location::current())
 Safely casts a numeric value to another type with overflow checking. More...
 
std::string trim (std::string_view text)
 Trims whitespace from both ends of the specified string. More...
 
std::string to_lower (std::string_view text)
 Converts the specified string to lowercase. More...
 
std::string to_upper (std::string_view text)
 Converts the specified string to uppercase. More...
 
std::string format_nbytes (double nbytes, int num_decimals=2, TrimZeroFraction trim_zero_fraction=TrimZeroFraction::YES)
 Format a byte count as a human-readable string using IEC units. More...
 
std::string format_duration (double seconds, int precision=2, TrimZeroFraction trim_zero_fraction=TrimZeroFraction::YES)
 Format a time duration as a human-readable string. More...
 
std::int64_t parse_nbytes (std::string_view text)
 Parse a human-readable byte count into an integer number of bytes. More...
 
std::size_t parse_nbytes_unsigned (std::string_view text)
 Parse a human-readable byte count into a non-negative number of bytes. More...
 
std::size_t parse_nbytes_or_percent (std::string_view text, double total_bytes)
 Parse a byte quantity or percentage into an absolute byte count. More...
 
Duration parse_duration (std::string_view text)
 Parse a human-readable time duration into seconds. More...
 
template<typename T >
T parse_string (std::string const &text)
 Parse a string into a value of type T. More...
 
template<>
bool parse_string (std::string const &text)
 Specialization of parse_string for boolean values. More...
 
std::optional< std::string > parse_optional (std::string text)
 Parse an optional string value. More...
 
std::vector< std::string > parse_string_list (std::string_view text, char delimiter=',')
 Parse a delimited string into a list of trimmed substrings. More...
 

Variables

constexpr bool COMM_HAVE_UCXX = false
 Whether RapidsMPF was built with the UCXX Communicator.
 
constexpr bool COMM_HAVE_MPI = false
 Whether RapidsMPF was built with the MPI Communicator.
 
constexpr std::array< MemoryType, 3 > MEMORY_TYPES
 All memory types sorted in decreasing order of preference. More...
 
constexpr std::array< char const *, MEMORY_TYPES.size()> MEMORY_TYPE_NAMES
 Memory type names sorted to match MemoryType and MEMORY_TYPES. More...
 
constexpr std::array< MemoryType, 2 > SPILL_TARGET_MEMORY_TYPES
 Memory types that are valid spill destinations in decreasing order of preference. More...
 

Detailed Description

RAPIDS Multi-Processor interfaces.

SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. SPDX-License-Identifier: Apache-2.0

Typedef Documentation

◆ OpID

Operation ID defined by the user. This allows users to concurrently execute multiple operations, and each operation will be identified by its OpID.

Note
Although typed as an int32, the number of distinct operations is limited to 2^20.

Definition at line 44 of file communicator.hpp.

◆ Rank

The rank of a node (e.g. the rank of an MPI process), or world size (total number of ranks).

Note
Ranks are always consecutive integers from zero up to, but not including, the total number of ranks.

Definition at line 34 of file communicator.hpp.

◆ StageID

Identifier for a stage of a communication operation.

Note
Although typed as an int32, the number of distinct stages is limited to 2^3.

Definition at line 53 of file communicator.hpp.

Enumeration Type Documentation

◆ AllowOverbooking

enum rapidsmpf::AllowOverbooking : bool
strong

Policy controlling whether a memory reservation is allowed to overbook.

This enum is used throughout RapidsMPF to specify the overbooking behavior of a memory reservation request. The exact semantics depend on the specific API and execution context in which it is used.

Enumerator
NO 

Overbooking is not allowed.

YES 

Overbooking is allowed.

Definition at line 39 of file buffer_resource.hpp.

◆ MemoryType

enum rapidsmpf::MemoryType : int
strong

Enum representing the type of memory sorted in decreasing order of preference.

Enumerator
DEVICE 

Device memory.

PINNED_HOST 

Pinned host memory.

HOST 

Host memory.

Definition at line 16 of file memory_type.hpp.

◆ TrimZeroFraction

Control whether a zero fractional part is omitted when formatting values.

Enumerator
NO 

Always keep the fractional part.

YES 

Omit the fractional part when it consists only of zeros.

Definition at line 43 of file string.hpp.

Function Documentation

◆ buffer_copy()

void rapidsmpf::buffer_copy ( std::shared_ptr< Statistics >  statistics,
Buffer &  dst,
Buffer const &  src,
std::size_t  size,
std::ptrdiff_t  dst_offset = 0,
std::ptrdiff_t  src_offset = 0 
)

Asynchronously copy data between buffers.

Copies size bytes from src, starting at src_offset, into dst at dst_offset.

Parameters
statistics  Statistics object used to record the copy operation. Use Statistics::disabled() to skip recording.
dst  Destination buffer.
src  Source buffer.
size  Number of bytes to copy.
dst_offset  Byte offset into the destination buffer.
src_offset  Byte offset into the source buffer.
Exceptions
std::invalid_argument  If the requested range is out of bounds.

◆ contains()

template<std::ranges::input_range R, typename T , typename Proj = std::identity>
constexpr bool rapidsmpf::contains ( R &&  range,
T const &  value,
Proj  proj = {} 
)
constexpr

Backport of std::ranges::contains from C++23 for C++20.

Checks whether a range contains a given value.

Template Parameters
R  An input range type.
T  The type of the value to search for.
Proj  A projection function applied to each element before comparison.
Parameters
range  The range to search.
value  The value to search for in the range.
proj  The projection to apply to each element before comparison.
Returns
true if any element in the range compares equal to value after projection, false otherwise.

Definition at line 281 of file misc.hpp.
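
To make the projection semantics concrete, here is a minimal sketch of the same check written without <ranges>, so it also builds with a C++17 compiler. It is not the library's implementation: the sketch namespace, identity_proj (a stand-in for std::identity), and the Peer struct are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-in for std::identity so the sketch also builds before C++20.
struct identity_proj {
    template <typename U>
    constexpr U&& operator()(U&& u) const noexcept {
        return std::forward<U>(u);
    }
};

// Returns true if any element, after projection, compares equal to `value`.
template <typename R, typename T, typename Proj = identity_proj>
bool contains(R&& range, T const& value, Proj proj = {}) {
    for (auto&& elem : range) {
        // std::invoke lets `proj` be a callable or a pointer to member.
        if (std::invoke(proj, elem) == value) {
            return true;
        }
    }
    return false;
}

// Hypothetical element type used to demonstrate a member-pointer projection.
struct Peer {
    int rank;
    std::string host;
};

}  // namespace sketch
```

With a std::vector<sketch::Peer>, calling sketch::contains(peers, 1, &sketch::Peer::rank) compares only the rank member of each element against 1.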

◆ cuda_stream_join() [1/2]

template<typename Range1 , typename Range2 >
void rapidsmpf::cuda_stream_join ( Range1 const &  downstreams,
Range2 const &  upstreams,
CudaEvent *  event = nullptr 
)

Make downstream CUDA streams wait on upstream CUDA streams.

This call is asynchronous with respect to the host thread; no host-side blocking occurs.

Template Parameters
Range1  Iterable whose elements are rmm::cuda_stream_view.
Range2  Iterable whose elements are rmm::cuda_stream_view.
Parameters
downstreams  Streams that must not run ahead.
upstreams  Streams whose already-enqueued work must complete first.
event  Optional CUDA event used for synchronization. A unique event per call is not required; the same event may be reused. If nullptr, a temporary event is created internally. The reason to provide an event is to avoid the small overhead of constructing a temporary one.
Note
If all upstream and downstream streams are identical, this function is a no-op.

Definition at line 34 of file cuda_stream.hpp.

◆ cuda_stream_join() [2/2]

void rapidsmpf::cuda_stream_join ( rmm::cuda_stream_view  downstream,
rmm::cuda_stream_view  upstream,
CudaEvent *  event = nullptr 
)
inline

Make a downstream CUDA stream wait on an upstream CUDA stream.

This call is asynchronous with respect to the host thread; no host-side blocking occurs.

Equivalent to calling the range overload with one upstream and one downstream.

Parameters
downstream  Stream that must not run ahead.
upstream  Stream whose already-enqueued work must complete first.
event  Optional CUDA event used for synchronization. A unique event per call is not required; the same event may be reused. If nullptr, a temporary event is created internally. The reason to provide an event is to avoid the small overhead of constructing a temporary one.
Note
If downstream and upstream are identical, this function is a no-op.
See also
cuda_stream_join(Range1 const&, Range2 const&, CudaEvent*)

Definition at line 91 of file cuda_stream.hpp.

◆ estimated_memory_usage() [1/2]

std::size_t rapidsmpf::estimated_memory_usage ( cudf::column_view const &  col,
rmm::cuda_stream_view  stream 
)

Estimate the memory usage of a column.

Parameters
col  The column to estimate the memory usage of.
stream  CUDA stream used for device memory operations and kernel launches.
Returns
The estimated memory usage of the column.

◆ estimated_memory_usage() [2/2]

std::size_t rapidsmpf::estimated_memory_usage ( cudf::table_view const &  tbl,
rmm::cuda_stream_view  stream 
)

Estimate the memory usage of a table.

Parameters
tbl  The table to estimate the memory usage of.
stream  CUDA stream used for device memory operations and kernel launches.
Returns
The estimated memory usage of the table.

◆ extract_item() [1/2]

template<typename MapType >
std::pair<typename MapType::key_type, typename MapType::mapped_type> rapidsmpf::extract_item ( MapType &  map,
typename MapType::const_iterator  position 
)

Extracts a key-value pair from a map, removing it from the map.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the key-value pair.
position  Const iterator pointing to a node in the map.
Returns
A pair containing the extracted key and value.
Note
Invalidates any iterators to the extracted element (notably position).
Exceptions
std::out_of_range  If the iterator is not found in the map.

Definition at line 50 of file misc.hpp.

◆ extract_item() [2/2]

template<typename MapType >
std::pair<typename MapType::key_type, typename MapType::mapped_type> rapidsmpf::extract_item ( MapType &  map,
typename MapType::key_type const &  key 
)

Extracts a key-value pair from a map, removing it from the map.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the key-value pair.
key  The key to extract.
Returns
A pair containing the extracted key and value.
Exceptions
std::out_of_range  If the key is not found in the map.

Definition at line 71 of file misc.hpp.
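
The documented contract (remove the entry, return key and value, throw when the key is missing) can be sketched with C++17 node handles. This is an illustrative sketch under that assumption, not the code in misc.hpp; the sketch namespace and the error message are invented here.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

namespace sketch {

// Remove `key` from `map` and return the (key, value) pair by value.
// Throws std::out_of_range when the key is absent, mirroring the documented contract.
template <typename MapType>
std::pair<typename MapType::key_type, typename MapType::mapped_type>
extract_item(MapType& map, typename MapType::key_type const& key) {
    auto node = map.extract(key);  // C++17 node handle; empty if the key is missing
    if (node.empty()) {
        throw std::out_of_range("key not found in map");
    }
    // The node handle owns the element, so both parts can be moved out.
    return {std::move(node.key()), std::move(node.mapped())};
}

}  // namespace sketch
```

Extracting through a node handle avoids copying the key or the value, which matters when the mapped type is move-only or expensive to copy.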

◆ extract_key() [1/2]

template<typename MapType >
MapType::key_type rapidsmpf::extract_key ( MapType &  map,
typename MapType::const_iterator  position 
)

Extracts a key from a map, removing the key-value pair.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the key.
position  Const iterator pointing to a node in the map.
Returns
The extracted key.
Note
Invalidates any iterators to the extracted element (notably position).
Exceptions
std::out_of_range  If the key is not found in the map.

Definition at line 149 of file misc.hpp.

◆ extract_key() [2/2]

template<typename MapType >
MapType::key_type rapidsmpf::extract_key ( MapType &  map,
typename MapType::key_type const &  key 
)

Extracts a key from a map, removing the key-value pair.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the key.
key  The key to extract.
Returns
The extracted key.
Exceptions
std::out_of_range  If the key is not found in the map.

Definition at line 130 of file misc.hpp.

◆ extract_value() [1/2]

template<typename MapType >
MapType::mapped_type rapidsmpf::extract_value ( MapType &  map,
typename MapType::const_iterator  position 
)

Extracts the value associated with a specific key from a map, removing the key-value pair.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the value.
position  Const iterator pointing to a node in the map.
Returns
The extracted value.
Note
Invalidates any iterators to the extracted element (notably position).
Exceptions
std::out_of_range  If the key is not found in the map.

Definition at line 113 of file misc.hpp.

◆ extract_value() [2/2]

template<typename MapType >
MapType::mapped_type rapidsmpf::extract_value ( MapType &  map,
typename MapType::key_type const &  key 
)

Extracts the value associated with a specific key from a map, removing the key-value pair.

Template Parameters
MapType  The type of the associative container.
Parameters
map  The map from which to extract the value.
key  The key associated with the value to extract.
Returns
The extracted value.
Exceptions
std::out_of_range  If the key is not found in the map.

Definition at line 93 of file misc.hpp.

◆ format_duration()

std::string rapidsmpf::format_duration ( double  seconds,
int  precision = 2,
TrimZeroFraction  trim_zero_fraction = TrimZeroFraction::YES 
)

Format a time duration as a human-readable string.

Converts a duration given in seconds into a scaled string representation using common time units such as ns, us, ms, s, min, h, and d.

The duration is accepted as a double to support both fractional seconds and very large values without overflow.

Negative values are supported and are formatted with a leading minus sign, which is useful when representing signed time deltas.

Decimal formatting is controlled by precision. When trim_zero_fraction is set to TrimZeroFraction::YES, the fractional part is omitted entirely if all decimal digits are zero. Otherwise, the specified number of decimal places is preserved.

Parameters
seconds  Time duration to format, in seconds.
precision  Number of decimal places to include in the formatted value.
trim_zero_fraction  Whether to omit the fractional part when it consists only of zeros.
Returns
Human-readable string representation of the time duration.

◆ format_nbytes()

std::string rapidsmpf::format_nbytes ( double  nbytes,
int  num_decimals = 2,
TrimZeroFraction  trim_zero_fraction = TrimZeroFraction::YES 
)

Format a byte count as a human-readable string using IEC units.

Converts an integer byte count into a scaled string representation using binary (base-1024) units such as KiB, MiB, and GiB.

Negative values are supported and are formatted with a leading minus sign, which is useful when representing signed byte deltas or accounting values.

Decimal formatting is controlled by num_decimals. When trim_zero_fraction is set to TrimZeroFraction::YES, the fractional part is omitted entirely if all decimal digits are zero. Otherwise, the specified number of decimal places is preserved.

Examples:

  • 1024 bytes with 2 decimals → "1.00 KiB" or "1 KiB" (trimmed)
  • 1536 bytes with 2 decimals → "1.50 KiB"
Parameters
nbytes  Signed number of bytes to format, provided as a double to support any integer magnitude.
num_decimals  Number of decimal places to include in the formatted value.
trim_zero_fraction  Whether to omit the fractional part when it consists only of zeros.
Returns
Human-readable string representation of the byte count.
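
The scaling and trimming rules described above can be sketched as follows. This is a simplified illustration, not the library's implementation: it lives in a hypothetical sketch namespace, takes a plain bool where the documented API uses TrimZeroFraction, and stops at EiB.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>

namespace sketch {

// Format `nbytes` with IEC (base-1024) units. `trim_zero_fraction` is a plain
// bool here; the documented API uses the TrimZeroFraction enum instead.
inline std::string format_nbytes(
    double nbytes, int num_decimals = 2, bool trim_zero_fraction = true
) {
    assert(num_decimals >= 0);
    static constexpr std::array<char const*, 7> units{
        "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};
    double value = std::fabs(nbytes);
    std::size_t unit = 0;
    // Divide by 1024 until the value fits the unit or the largest unit is reached.
    while (value >= 1024.0 && unit + 1 < units.size()) {
        value /= 1024.0;
        ++unit;
    }
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%.*f", num_decimals, value);
    std::string digits(buf);
    // Drop the fractional part only when every decimal digit is zero.
    auto const dot = digits.find('.');
    if (trim_zero_fraction && dot != std::string::npos
        && digits.find_first_not_of('0', dot + 1) == std::string::npos) {
        digits.erase(dot);
    }
    return (nbytes < 0 ? "-" : "") + digits + " " + units[unit];
}

}  // namespace sketch
```

With these rules, sketch::format_nbytes(1024) yields "1 KiB", sketch::format_nbytes(1024, 2, false) yields "1.00 KiB", and sketch::format_nbytes(1536) yields "1.50 KiB", matching the examples listed above.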

◆ get_current_numa_node()

int rapidsmpf::get_current_numa_node ( )
noexcept

Get the NUMA node ID associated with the calling CPU thread.

A NUMA (Non-Uniform Memory Access) node represents a group of CPU cores and memory that have faster access to each other than to memory attached to other nodes. On NUMA systems, binding allocations and threads to the same NUMA node can significantly reduce memory access latency and improve bandwidth.

This function returns the NUMA node on which the calling thread is currently executing, as determined by the operating system's CPU and memory topology. The value can change if the thread migrates between CPUs.

If NUMA support is not available on the system or cannot be queried, the function returns 0, which corresponds to the single implicit NUMA node on non-NUMA systems.

Returns
The NUMA node ID of the calling thread, or 0 if NUMA is unavailable.

◆ get_current_numa_nodes()

std::vector<int> rapidsmpf::get_current_numa_nodes ( )
noexcept

Get current NUMA node(s) for memory binding.

Queries the NUMA node associated with the CPU on which the calling thread is currently executing. This is a best-effort approach and may not be accurate in all cases.

Since processes are typically scheduled on CPUs that are local to their memory, using the CPU's NUMA node (via numa_node_of_cpu) provides a reasonable approximation that works well in practice for topology-aware binding scenarios. This intentionally avoids querying the process memory binding policy programmatically.

If NUMA support is not available or the NUMA node cannot be determined, the function returns a vector containing a single element, 0, which corresponds to the single implicit NUMA node on non-NUMA systems.

Returns
Vector of NUMA node IDs associated with the calling thread.

◆ get_numa_node_host_memory()

std::uint64_t rapidsmpf::get_numa_node_host_memory ( int  numa_id = get_current_numa_node())
noexcept

Get the total amount of host memory for a NUMA node.

Parameters
numa_id  NUMA node for which to query the total host memory. Defaults to the current NUMA node as returned by get_current_numa_node().
Note
If NUMA support is not available or the node size cannot be determined, this function falls back to returning the total host memory.
Returns
Total host memory of the NUMA node in bytes.

◆ get_total_host_memory()

std::uint64_t rapidsmpf::get_total_host_memory ( )
noexcept

Get the total amount of system memory.

Returns
Total host memory in bytes.
Note
On WSL and in containerized environments, the returned value reflects the memory visible to the Linux kernel instance, which may differ from the physical memory of the host.
Terminates the process if sysconf(_SC_PAGE_SIZE) or sysconf(_SC_PHYS_PAGES) fails.
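
The two sysconf queries named in the note suggest the computation below. It is a sketch for POSIX/Linux systems under that assumption, placed in a hypothetical sketch namespace, and is not taken from the library source.

```cpp
#include <cassert>
#include <cstdint>
#include <exception>
#include <unistd.h>

namespace sketch {

// Total memory visible to the kernel: page size times number of physical pages.
// POSIX/Linux only; terminates if either sysconf query fails, as documented.
inline std::uint64_t get_total_host_memory() noexcept {
    long const page_size = ::sysconf(_SC_PAGE_SIZE);
    long const num_pages = ::sysconf(_SC_PHYS_PAGES);
    if (page_size <= 0 || num_pages <= 0) {  // sysconf reports failure as -1
        std::terminate();
    }
    return static_cast<std::uint64_t>(page_size)
           * static_cast<std::uint64_t>(num_pages);
}

}  // namespace sketch
```

Because the value comes from the kernel's page count, it reflects the memory visible to that kernel instance rather than the physical memory of the host, which is the WSL and container caveat noted above.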

◆ is_pinned_memory_resources_supported()

bool rapidsmpf::is_pinned_memory_resources_supported ( )
inline

Checks if the PinnedMemoryResource is supported for the current CUDA version.

RapidsMPF requires CUDA 12.6 or newer to support pinned memory resources.

Returns
True if the PinnedMemoryResource is supported for the current CUDA version, false otherwise.

Definition at line 43 of file pinned_memory_resource.hpp.

◆ is_running_under_valgrind()

bool rapidsmpf::is_running_under_valgrind ( )

Checks whether the application is running under Valgrind.

Returns
true if the application is running under Valgrind, false otherwise.

◆ leq_memory_types()

constexpr std::span<MemoryType const> rapidsmpf::leq_memory_types ( MemoryType  mem_type)
constexpr noexcept

Get the memory types with preference lower than or equal to mem_type.

The returned span reflects the predefined ordering used in MEMORY_TYPES, which lists memory types in decreasing order of preference.

Parameters
mem_type  The memory type used as the starting point.
Returns
A span of memory types whose preference is lower than or equal to the given type.

Definition at line 54 of file memory_type.hpp.

◆ memory_available_from_options()

std::unordered_map<MemoryType, BufferResource::MemoryAvailable> rapidsmpf::memory_available_from_options ( RmmResourceAdaptor *  mr,
config::Options  options 
)

Construct a map of memory-available functions from configuration options.

Parameters
mr  Pointer to a memory resource adaptor.
options  Configuration options.
Returns
The map of memory-available functions.

◆ operator<<() [1/2]

std::ostream& rapidsmpf::operator<< ( std::ostream &  os,
Communicator const &  obj 
)
inline

Overloads the stream insertion operator for the Communicator class.

This function allows a description of a Communicator to be written to an output stream.

Parameters
os  The output stream to write to.
obj  The object to write.
Returns
A reference to the modified output stream.

Definition at line 653 of file communicator.hpp.

◆ operator<<() [2/2]

std::ostream& rapidsmpf::operator<< ( std::ostream &  os,
MemoryType  mem_type 
)

Overload to write type name to the output stream.

Parameters
os  The output stream.
mem_type  The memory type whose name is written to the output stream.
Returns
The output stream.

◆ operator>>()

std::istream& rapidsmpf::operator>> ( std::istream &  is,
MemoryType &  out 
)

Overload to read a MemoryType value from an input stream.

Parsing is case-insensitive. Supported values are: "DEVICE", "PINNED_HOST", "PINNED", "PINNED-HOST", and "HOST".

If token extraction from the stream fails, the stream state is preserved. If extraction succeeds but the token does not represent a valid MemoryType, the stream failbit is set.

Parameters
is  The input stream.
out  The memory type read from the input stream.
Returns
The input stream.
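
The parsing rules above can be sketched as follows. This is an illustration under stated assumptions, not the library's operator: MemoryType is a local stand-in, and read_memory_type and try_parse_memory_type are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

enum class MemoryType { DEVICE, PINNED_HOST, HOST };  // hypothetical stand-in

// Read one token, match it case-insensitively, and set failbit on an
// unknown name. If the extraction itself fails, the stream is returned
// exactly as the failed extraction left it.
inline std::istream& read_memory_type(std::istream& is, MemoryType& out) {
    std::string token;
    if (!(is >> token)) {
        return is;
    }
    std::transform(token.begin(), token.end(), token.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    if (token == "DEVICE") {
        out = MemoryType::DEVICE;
    } else if (token == "PINNED_HOST" || token == "PINNED" || token == "PINNED-HOST") {
        out = MemoryType::PINNED_HOST;
    } else if (token == "HOST") {
        out = MemoryType::HOST;
    } else {
        is.setstate(std::ios::failbit);  // valid token, invalid MemoryType
    }
    return is;
}

// Convenience wrapper for trying the reader on a string; false on failure.
inline bool try_parse_memory_type(std::string const& text, MemoryType& out) {
    std::istringstream in(text);
    return static_cast<bool>(read_memory_type(in, out));
}
```

Note that out is left unmodified when the token is not recognized, so callers should test the stream state before using the value.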

◆ parse_duration()

Duration rapidsmpf::parse_duration ( std::string_view  text)

Parse a human-readable time duration into seconds.

Parses a numeric value followed by an optional time unit suffix and converts it to a Duration, which represents a time interval in seconds as a double.

Supported units:

  • Nanoseconds: ns
  • Microseconds: µs or us
  • Milliseconds: ms
  • Seconds: s
  • Minutes: m or min
  • Hours: h
  • Days: d

Units are case-insensitive. If no unit is provided, the value is interpreted as seconds.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e3", "2.5E-2"). Negative values are supported.

Parameters
text  Time duration string to parse.
Returns
Parsed duration in seconds.
Exceptions
std::invalid_argument  If the string format is invalid or the unit is not recognized.
std::out_of_range  If the parsed value is not finite.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
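
To make the unit table concrete, here is a minimal sketch of a parser with the documented behavior. It is not the library's implementation: parse_duration_sketch is a hypothetical name, it returns plain double seconds instead of Duration, it relies on std::strtod for the numeric part, and its tolerance for whitespace between number and unit is an assumption.

```cpp
#include <cctype>
#include <cmath>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical sketch: parse "<number>[unit]" and return seconds.
inline double parse_duration_sketch(std::string_view text) {
    std::string const s{text};
    char* end = nullptr;
    double const value = std::strtod(s.c_str(), &end);
    if (end == s.c_str()) {
        throw std::invalid_argument("no numeric value in: " + s);
    }
    if (!std::isfinite(value)) {
        throw std::out_of_range("non-finite duration: " + s);
    }
    // Everything after the number is the unit: lowercase it, drop whitespace.
    std::string unit;
    for (char const* p = end; *p != '\0'; ++p) {
        auto const c = static_cast<unsigned char>(*p);
        if (!std::isspace(c)) {
            unit += static_cast<char>(std::tolower(c));
        }
    }
    if (unit.empty() || unit == "s") return value;  // seconds by default
    if (unit == "ns") return value * 1e-9;
    if (unit == "us" || unit == "\xC2\xB5s") return value * 1e-6;  // "µs" in UTF-8
    if (unit == "ms") return value * 1e-3;
    if (unit == "m" || unit == "min") return value * 60.0;
    if (unit == "h") return value * 3600.0;
    if (unit == "d") return value * 86400.0;
    throw std::invalid_argument("unrecognized unit in: " + s);
}
```

Under this sketch, "2min" and "120" denote the same interval, and "1.5H" is accepted because units are matched case-insensitively.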

◆ parse_nbytes()

std::int64_t rapidsmpf::parse_nbytes ( std::string_view  text)

Parse a human-readable byte count into an integer number of bytes.

Parses a numeric value followed by an optional unit suffix and converts it to a byte count. Both IEC (base-1024) and SI (base-1000) units are supported.

Supported units:

  • Bytes: B
  • IEC (base-1024): KiB, MiB, GiB, TiB, PiB, EiB, ZiB, YiB
  • SI (base-1000): KB, MB, GB, TB, PB, EB, ZB, YB

Units are case-insensitive. If no unit is provided, the value is interpreted as bytes.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e6", "2.5E-3"). The final byte count is rounded to the nearest integer, with ties rounded away from zero.

Parameters
text  Byte count string to parse.
Returns
Parsed byte count in bytes.
Exceptions
std::invalid_argument  If the string format is invalid or the unit is not recognized.
std::out_of_range  If the parsed value is not finite or the resulting byte count overflows a 64-bit signed integer.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
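
The unit handling and rounding rule can be sketched as below. This is an illustrative re-implementation, not the library's code: parse_nbytes_sketch is a hypothetical name, the numeric part is read with std::strtod, and whitespace between the number and the unit is tolerated as an assumption.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical sketch of the documented parse_nbytes semantics.
inline std::int64_t parse_nbytes_sketch(std::string_view text) {
    std::string const s{text};
    char* end = nullptr;
    double const value = std::strtod(s.c_str(), &end);
    if (end == s.c_str()) throw std::invalid_argument("no numeric value in: " + s);
    if (!std::isfinite(value)) throw std::out_of_range("non-finite value in: " + s);

    // Lowercase the suffix and drop whitespace.
    std::string unit;
    for (char const* p = end; *p != '\0'; ++p) {
        auto const c = static_cast<unsigned char>(*p);
        if (!std::isspace(c)) unit += static_cast<char>(std::tolower(c));
    }

    double multiplier = 1.0;
    if (!unit.empty() && unit != "b") {
        // Expect "<prefix>b" (SI, base 1000) or "<prefix>ib" (IEC, base 1024).
        std::string_view const prefixes = "kmgtpezy";
        auto const pos = prefixes.find(unit[0]);
        bool const iec = unit.size() == 3 && unit.compare(1, 2, "ib") == 0;
        bool const si = unit.size() == 2 && unit[1] == 'b';
        if (pos == std::string_view::npos || !(iec || si)) {
            throw std::invalid_argument("unrecognized unit in: " + s);
        }
        double const base = iec ? 1024.0 : 1000.0;
        for (std::size_t i = 0; i <= pos; ++i) multiplier *= base;
    }

    double const rounded = std::round(value * multiplier);  // ties go away from zero
    // 2^63 is exactly representable as a double, so these bounds are exact.
    constexpr double limit = 9223372036854775808.0;
    if (rounded >= limit || rounded < -limit) {
        throw std::out_of_range("byte count overflows int64: " + s);
    }
    return static_cast<std::int64_t>(rounded);
}
```

std::round is used because it rounds halfway cases away from zero, which matches the tie-breaking rule stated above.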

◆ parse_nbytes_or_percent()

std::size_t rapidsmpf::parse_nbytes_or_percent ( std::string_view  text,
double  total_bytes 
)

Parse a byte quantity or percentage into an absolute byte count.

The input may be a human-readable byte string (e.g. "1GiB", "512MB") or a percentage (e.g. "25%"). See parse_nbytes_unsigned for the exact parsing semantics of the numeric part.

If text ends with '%', the numeric part is first parsed using parse_nbytes_unsigned, then interpreted as a percentage of total_bytes.

Otherwise, text is parsed as an absolute byte value and returned as-is.

Parameters
text  Input string representing a byte quantity or percentage.
total_bytes  Total number of bytes used when text is a percentage. Must be positive.
Returns
Absolute number of bytes computed from text.
Exceptions
std::invalid_argument  If the input format is invalid, the value is negative, or if total_bytes is not positive.
std::out_of_range  If the parsed or computed value exceeds the representable range.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
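
The two branches described above (percentage versus absolute value) can be sketched as follows. This is a simplified illustration, not the library's implementation: nbytes_or_percent_sketch and parse_plain_nonnegative are hypothetical names, the stand-in for parse_nbytes_unsigned accepts only plain numbers without unit suffixes, and rounding the computed percentage to the nearest byte is an assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <stdexcept>
#include <string>
#include <string_view>

// Simplified stand-in for parse_nbytes_unsigned: a plain non-negative number,
// rounded to the nearest integer, with no unit suffixes.
inline double parse_plain_nonnegative(std::string_view text) {
    std::string const s{text};
    char* end = nullptr;
    double const v = std::strtod(s.c_str(), &end);
    if (end == s.c_str() || *end != '\0') throw std::invalid_argument("bad number: " + s);
    if (v < 0) throw std::invalid_argument("negative value: " + s);
    return std::round(v);
}

inline std::size_t nbytes_or_percent_sketch(std::string_view text, double total_bytes) {
    if (!(total_bytes > 0)) throw std::invalid_argument("total_bytes must be positive");
    if (text.empty()) throw std::invalid_argument("empty input");
    double bytes = 0.0;
    if (text.back() == '%') {
        text.remove_suffix(1);  // parse the numeric part, then scale by total_bytes
        bytes = std::round(parse_plain_nonnegative(text) / 100.0 * total_bytes);
    } else {
        bytes = parse_plain_nonnegative(text);  // absolute value, returned as-is
    }
    // The cast of max() to double rounds up to 2^64 on 64-bit platforms,
    // so anything not strictly below it is out of range (this also catches NaN).
    if (!(bytes < static_cast<double>(std::numeric_limits<std::size_t>::max()))) {
        throw std::out_of_range("value exceeds std::size_t");
    }
    return static_cast<std::size_t>(bytes);
}
```

With total_bytes set to 1000, the input "25%" yields 250 bytes, while "4096" is returned unchanged regardless of total_bytes.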

◆ parse_nbytes_unsigned()

std::size_t rapidsmpf::parse_nbytes_unsigned ( std::string_view  text)

Parse a human-readable byte count into a non-negative number of bytes.

Parses a numeric value followed by an optional unit suffix and converts it to a byte count. Both IEC (base-1024) and SI (base-1000) units are supported.

Supported units:

  • Bytes: B
  • IEC (base-1024): KiB, MiB, GiB, TiB, PiB, EiB, ZiB, YiB
  • SI (base-1000): KB, MB, GB, TB, PB, EB, ZB, YB

Units are case-insensitive. If no unit is provided, the value is interpreted as bytes.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e6", "2.5E-3"). The final byte count is rounded to the nearest integer, with ties rounded away from zero.

Negative values are not permitted.

Parameters
text  Byte count string to parse.
Returns
Parsed byte count in bytes.
Exceptions
std::invalid_argument  If the string format is invalid, the unit is not recognized, or the parsed value is negative.
std::out_of_range  If the parsed value is not finite or overflows std::size_t.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ parse_optional()

std::optional<std::string> rapidsmpf::parse_optional ( std::string  text)

Parse an optional string value.

Returns std::nullopt if the input string represents a disabled value. Otherwise, the input string is returned unchanged.

Disabled values are matched case-insensitively and may include surrounding whitespace. Recognized values include: false, no, off, disable, disabled, none, n/a, and na.

Parameters
text  Input string to parse.
Returns
std::optional<std::string> Parsed optional string.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
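
A minimal sketch of the matching rule described above, using the listed disabled values. The name parse_optional_sketch is hypothetical, and treating an empty string as an enabled value is an assumption of this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

inline std::optional<std::string> parse_optional_sketch(std::string text) {
    // Build a trimmed, lowercased key for matching only; the original text
    // is returned untouched when it is not a disabled value.
    auto const is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    auto const first = std::find_if_not(text.begin(), text.end(), is_space);
    auto const last = std::find_if_not(text.rbegin(), text.rend(), is_space).base();
    std::string key = first < last ? std::string(first, last) : std::string{};
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    constexpr std::array<std::string_view, 8> disabled{
        "false", "no", "off", "disable", "disabled", "none", "n/a", "na"};
    if (std::find(disabled.begin(), disabled.end(), std::string_view{key}) != disabled.end()) {
        return std::nullopt;
    }
    return text;
}
```

For example, "  OFF " maps to std::nullopt, while " /tmp/log " is returned with its whitespace intact.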

◆ parse_string() [1/2]

template<typename T >
T rapidsmpf::parse_string ( std::string const &  text)

Specialization of parse_string for boolean values.

Converts the input string to a boolean. This function handles common boolean representations such as true, false, on, off, yes, and no, as well as numeric representations (e.g., 0 or 1). The input is first checked for a numeric value using std::stoi; if that fails, it is lowercased and trimmed before matching against known textual representations.

Parameters
text  String to convert to a boolean.
Returns
The corresponding boolean value.
Exceptions
std::invalid_argument  If the string cannot be interpreted as a boolean.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

Definition at line 242 of file string.hpp.

◆ parse_string() [2/2]

template<>
bool rapidsmpf::parse_string ( std::string const &  text)

Specialization of parse_string for boolean values.

Converts the input string to a boolean. This function handles common boolean representations such as true, false, on, off, yes, and no, as well as numeric representations (e.g., 0 or 1). The input is first checked for a numeric value using std::stoi; if that fails, it is lowercased and trimmed before matching against known textual representations.

Parameters
text  String to convert to a boolean.
Returns
The corresponding boolean value.
Exceptions
std::invalid_argument  If the string cannot be interpreted as a boolean.

Definition at line 242 of file string.hpp.
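
The two-stage approach described above (numeric check first, then textual matching) can be sketched as follows. parse_bool_sketch is a hypothetical name, and mapping every nonzero integer to true is an assumption of this sketch rather than documented behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

inline bool parse_bool_sketch(std::string const& text) {
    // Stage 1: numeric representations such as "0" or "1".
    try {
        return std::stoi(text) != 0;
    } catch (std::invalid_argument const&) {
        // Not numeric; fall through to the textual forms.
    }
    // Stage 2: trim, lowercase, and match known words.
    auto const first = text.find_first_not_of(" \t\n\r\f\v");
    auto const last = text.find_last_not_of(" \t\n\r\f\v");
    std::string s =
        first == std::string::npos ? std::string{} : text.substr(first, last - first + 1);
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (s == "true" || s == "on" || s == "yes") return true;
    if (s == "false" || s == "off" || s == "no") return false;
    throw std::invalid_argument("cannot interpret \"" + text + "\" as a boolean");
}
```

One consequence of checking with std::stoi first is that input with a numeric prefix, such as "1abc", is accepted by this sketch, because std::stoi stops at the first non-digit.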

◆ parse_string_list()

std::vector<std::string> rapidsmpf::parse_string_list ( std::string_view  text,
char  delimiter = ',' 
)

Parse a delimited string into a list of trimmed substrings.

Splits the input string by the specified delimiter and returns a vector of trimmed tokens. Leading and trailing whitespace is removed from each token.

If the input string is empty or contains only whitespace, an empty vector is returned.

Parameters
text  Input string to parse.
delimiter  Character to use as the delimiter. Defaults to comma.
Returns
Vector of trimmed strings.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
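
The split-and-trim behavior can be sketched as below. The names parse_string_list_sketch and trim_view are hypothetical, and keeping empty tokens (for example from "a,,b") is an assumption, since the description above does not say how they are handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Strip leading and trailing whitespace from a view without copying.
inline std::string_view trim_view(std::string_view s) {
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.front()))) s.remove_prefix(1);
    while (!s.empty() && std::isspace(static_cast<unsigned char>(s.back()))) s.remove_suffix(1);
    return s;
}

inline std::vector<std::string> parse_string_list_sketch(std::string_view text,
                                                         char delimiter = ',') {
    std::vector<std::string> out;
    if (trim_view(text).empty()) {
        return out;  // empty or whitespace-only input yields an empty vector
    }
    std::size_t start = 0;
    while (true) {
        auto const pos = text.find(delimiter, start);
        auto const len = pos == std::string_view::npos ? pos : pos - start;
        out.emplace_back(trim_view(text.substr(start, len)));
        if (pos == std::string_view::npos) break;
        start = pos + 1;
    }
    return out;
}
```

For instance, " a, b ,c " produces the three tokens "a", "b", and "c".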

◆ partition_and_pack()

std::unordered_map<shuffler::PartID, PackedData> rapidsmpf::partition_and_pack ( cudf::table_view const &  table,
std::vector< cudf::size_type > const &  columns_to_hash,
int  num_partitions,
cudf::hash_id  hash_function,
std::uint32_t  seed,
rmm::cuda_stream_view  stream,
BufferResource br,
AllowOverbooking  allow_overbooking = AllowOverbooking::YES 
)

Partitions rows from the input table into multiple packed (serialized) tables.

Parameters
table  The table to partition.
columns_to_hash  Indices of input columns to hash.
num_partitions  The number of partitions to use.
hash_function  Hash function to use.
seed  Seed value to the hash function.
stream  CUDA stream used for device memory operations and kernel launches.
br  Buffer resource for memory allocations.
allow_overbooking  If true, allow overbooking (true by default). TODO: disable this by default (https://github.com/rapidsmpf/rapidsmpf/issues/449).
Returns
A map of partition IDs and their packed tables.
Exceptions
std::out_of_range  If an index in columns_to_hash is invalid.
See also
unpack_and_concat
cudf::hash_partition
cudf::pack

◆ partition_and_split()

std::pair<std::vector<cudf::table_view>, std::unique_ptr<cudf::table> > rapidsmpf::partition_and_split ( cudf::table_view const &  table,
std::vector< cudf::size_type > const &  columns_to_hash,
int  num_partitions,
cudf::hash_id  hash_function,
std::uint32_t  seed,
rmm::cuda_stream_view  stream,
BufferResource br,
AllowOverbooking  allow_overbooking = AllowOverbooking::YES 
)

Partitions rows from the input table into multiple output tables.

Parameters
table  The table to partition.
columns_to_hash  Indices of input columns to hash.
num_partitions  The number of partitions.
hash_function  Hash function to use.
seed  Seed value to the hash function.
stream  CUDA stream used for device memory operations and kernel launches.
br  Buffer resource for memory allocations.
allow_overbooking  If true, allow overbooking (true by default).
Returns
A vector of each partition and a table that owns the device memory.
Exceptions
std::out_of_range  If an index in columns_to_hash is invalid.
See also
cudf::hash_partition
cudf::split

◆ periodic_spill_check_from_options()

std::optional<Duration> rapidsmpf::periodic_spill_check_from_options ( config::Options  options)

Get the periodic_spill_check parameter from configuration options.

Parameters
options  Configuration options.
Returns
The duration of the pause between spill checks or std::nullopt if no dedicated thread should check for spilling.

◆ safe_cast()

template<typename To , typename From >
requires std::is_arithmetic_v<To> && std::is_arithmetic_v<From>
constexpr To rapidsmpf::safe_cast ( From  value,
std::source_location const &  loc = std::source_location::current() 
)
constexpr

Safely casts a numeric value to another type with overflow checking.

For integral conversions, the value must be representable in the destination type or an exception is thrown.

For conversions involving floating point types, overflow and underflow follow standard floating point semantics. The result may become inf or -inf, or lose precision, without throwing.

Template Parameters
To  The destination type.
From  The source type.
Parameters
value  The value to cast.
loc  Source location (automatically captured).
Returns
The safely cast value, of type To.
Exceptions
std::overflow_error  If an integral value cannot be represented in the destination type.

Definition at line 311 of file misc.hpp.

◆ safe_div()

template<typename T >
constexpr T rapidsmpf::safe_div ( T  x,
T  y 
)
constexpr

Performs safe division, returning 0 if the denominator is zero.

Template Parameters
T  The numeric type of the operands.
Parameters
x  The numerator.
y  The denominator.
Returns
The result of x / y, or 0 if y is zero.

Definition at line 192 of file misc.hpp.
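
A minimal sketch of the behavior described above. The name safe_div_sketch is hypothetical, and the body is an illustration rather than the code at the cited definition.

```cpp
// Return x / y, or zero when the denominator is zero.
template <typename T>
constexpr T safe_div_sketch(T x, T y) {
    return y == T{} ? T{} : x / y;
}

static_assert(safe_div_sketch(10, 4) == 2);       // integer division truncates
static_assert(safe_div_sketch(1.0, 0.0) == 0.0);  // no inf for a zero denominator
```

This is convenient for ratios such as averages or throughput, where a zero count should yield zero rather than a division error or an infinite value.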

◆ spill_partitions()

std::vector<PackedData> rapidsmpf::spill_partitions ( std::vector< PackedData > &&  partitions,
BufferResource br 
)

Spill partitions from device memory to host memory.

Moves the buffer of each PackedData from device memory to host memory using the provided buffer resource and the buffer's CUDA stream. Partitions that are already in host memory are passed through unchanged.

For device-resident partitions, a host memory reservation is made before moving the buffer. If the reservation fails due to insufficient host memory, an exception is thrown. Overbooking is not allowed.

Parameters
partitions  The partitions to spill.
br  Buffer resource used to reserve host memory and perform the move.
Returns
A vector of PackedData, where each buffer resides in host memory.
Exceptions
rapidsmpf::reservation_error  If host memory reservation fails.

◆ split_and_pack()

std::unordered_map<shuffler::PartID, PackedData> rapidsmpf::split_and_pack ( cudf::table_view const &  table,
std::vector< cudf::size_type > const &  splits,
rmm::cuda_stream_view  stream,
BufferResource br,
AllowOverbooking  allow_overbooking = AllowOverbooking::YES 
)

Splits rows from the input table into multiple packed (serialized) tables.

Parameters
table  The table to split and pack into partitions.
splits  The split points, as in cudf::split(); the number of split points is one less than the number of resulting partitions.
stream  CUDA stream used for device memory operations and kernel launches.
br  Buffer resource for memory allocations.
allow_overbooking  If true, allow overbooking (true by default). TODO: disable this by default (https://github.com/rapidsmpf/rapidsmpf/issues/449).
Returns
A map of partition IDs and their packed tables.
Exceptions
std::out_of_range  If the splits are invalid.
See also
unpack_and_concat
cudf::split
partition_and_pack
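
To illustrate how split points relate to partitions, here is a CPU-only sketch that maps split points to half-open row ranges. It does not call cudf or RapidsMPF; split_ranges_sketch is a hypothetical helper, and the validity rule it enforces (non-decreasing split points within the row count) is an assumption about what counts as invalid.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Map N split points to N + 1 half-open row ranges [begin, end).
inline std::vector<std::pair<int, int>> split_ranges_sketch(int num_rows,
                                                            std::vector<int> const& splits) {
    std::vector<std::pair<int, int>> ranges;
    ranges.reserve(splits.size() + 1);
    int begin = 0;
    for (int const s : splits) {
        if (s < begin || s > num_rows) {
            throw std::out_of_range("invalid split point");
        }
        ranges.emplace_back(begin, s);
        begin = s;
    }
    ranges.emplace_back(begin, num_rows);  // the last partition runs to the end
    return ranges;
}
```

With 10 rows and split points {2, 5}, the sketch yields three ranges: [0, 2), [2, 5), and [5, 10).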

◆ str() [1/3]

std::string rapidsmpf::str ( cudf::column_view  col,
cudf::size_type  index,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() 
)

Converts the element at a specific index in a cudf::column_view to a string.

Parameters
col  The column view containing the data.
index  The index of the element to convert.
stream  CUDA stream used for device memory operations and kernel launches.
mr  Memory resource for device memory allocation.
Returns
A string representation of the element at the specified index.

◆ str() [2/3]

std::string rapidsmpf::str ( cudf::column_view  col,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() 
)

Converts all elements in a cudf::column_view to a string.

Parameters
col  The column view containing the data.
stream  CUDA stream used for device memory operations and kernel launches.
mr  Memory resource for device memory allocation.
Returns
A string representation of all elements in the column.

◆ str() [3/3]

std::string rapidsmpf::str ( cudf::table_view  tbl,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() 
)

Converts all rows in a cudf::table_view to a string.

Parameters
tbl  The table view containing the data.
stream  CUDA stream used for device memory operations and kernel launches.
mr  Memory resource for device memory allocation.
Returns
A string representation of all rows in the table.

◆ stream_pool_from_options()

std::shared_ptr<rmm::cuda_stream_pool> rapidsmpf::stream_pool_from_options ( config::Options  options)

Get a new CUDA stream pool from configuration options.

Parameters
options  Configuration options.
Returns
Pool of CUDA streams used throughout RapidsMPF for operations that do not take an explicit CUDA stream.

◆ to_lower()

std::string rapidsmpf::to_lower ( std::string_view  text)

Converts the specified string to lowercase.

Parameters
text  The input string to be processed.
Returns
The lowercased string.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ to_string()

constexpr char const* rapidsmpf::to_string ( MemoryType  mem_type)
constexpr

Get the name of a MemoryType.

Parameters
mem_type  The memory type.
Returns
The memory type name.

Definition at line 75 of file memory_type.hpp.

◆ to_upper()

std::string rapidsmpf::to_upper ( std::string_view  text)

Converts the specified string to uppercase.

Parameters
text  The input string to be processed.
Returns
The uppercased string.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ to_vector()

template<typename MapType >
auto rapidsmpf::to_vector ( MapType &&  map)

Converts a map-like associative container to a vector by moving the values and discarding the keys.

Template Parameters
MapType  The type of the map-like associative container. Must provide a mapped_type and support range-based for-loops.
Parameters
map  The map whose values will be moved into the resulting vector. Keys are ignored.
Returns
A std::vector containing the moved values from the input map.

Definition at line 166 of file misc.hpp.
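
A minimal sketch of the conversion described above. The names to_vector_sketch and example_values are hypothetical, and the call to size() for reserving capacity is an extra requirement of this sketch beyond the documented ones (mapped_type and range-based iteration).

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

template <typename MapType>
auto to_vector_sketch(MapType&& map) {
    using Mapped = typename std::remove_reference_t<MapType>::mapped_type;
    std::vector<Mapped> out;
    out.reserve(map.size());
    for (auto& [key, value] : map) {
        (void)key;                        // keys are discarded
        out.push_back(std::move(value));  // values are moved out of the map
    }
    return out;
}

// Example: values come out in the container's iteration order, which for
// std::map is ascending key order.
inline std::vector<std::string> example_values() {
    std::map<int, std::string> m{{2, "b"}, {1, "a"}};
    return to_vector_sketch(std::move(m));
}
```

After the call, the map still holds its keys but the mapped values are in a moved-from state, so the map should not be reused for its contents. With an unordered container, the order of the resulting vector is unspecified.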

◆ trim()

std::string rapidsmpf::trim ( std::string_view  text)

Trims whitespace from both ends of the specified string.

Parameters
text  The input string to be processed.
Returns
The trimmed string.
Examples
/__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ unpack_and_concat()

std::unique_ptr<cudf::table> rapidsmpf::unpack_and_concat ( std::vector< PackedData > &&  partitions,
rmm::cuda_stream_view  stream,
BufferResource br,
AllowOverbooking  allow_overbooking = AllowOverbooking::YES 
)

Unpack (deserialize) input partitions and concatenate them into a single table.

Empty partitions are ignored.

The unpacking of each partition is stream-ordered on that partition's own CUDA stream. The returned table is stream-ordered on the provided stream and synchronized with the unpacking.

Parameters
partitions  Packed input tables (partitions).
stream  CUDA stream on which concatenation occurs and on which the resulting table is ordered.
br  Buffer resource used for memory allocations.
allow_overbooking  If true, allow overbooking (true by default).
Returns
The concatenated table resulting from unpacking the input partitions.
Exceptions
rapidsmpf::reservation_error  If the buffer resource cannot reserve enough memory to concatenate all partitions.
std::logic_error  If the partitions are not in device memory.
See also
partition_and_pack
cudf::unpack
cudf::concatenate

◆ unspill_partitions()

std::vector<PackedData> rapidsmpf::unspill_partitions ( std::vector< PackedData > &&  partitions,
BufferResource br,
AllowOverbooking  allow_overbooking 
)

Move spilled partitions (i.e., packed tables in host memory) back to device memory.

Each partition is inspected to determine whether its buffer resides in device memory. Buffers already in device memory are left untouched. Host-resident buffers are moved to device memory using the provided buffer resource and the buffer's CUDA stream.

If insufficient device memory is available, the buffer resource's spill manager is invoked to free memory. If overbooking occurs and spilling fails to reclaim enough memory, behavior depends on the allow_overbooking flag.

Parameters
partitions  The partitions to unspill, potentially containing host-resident data.
br  Buffer resource responsible for memory reservation and spills.
allow_overbooking  If false, ensures enough memory is freed to satisfy the reservation; otherwise, allows overbooking even if spilling was insufficient.
Returns
A vector of PackedData, each with a buffer in device memory.
Exceptions
rapidsmpf::reservation_error  If overbooking exceeds the amount spilled and allow_overbooking is false.

Variable Documentation

◆ MEMORY_TYPE_NAMES

constexpr std::array<char const*, MEMORY_TYPES.size()> rapidsmpf::MEMORY_TYPE_NAMES
constexpr
Initial value:
{
{"DEVICE", "PINNED_HOST", "HOST"}
}

Memory type names sorted to match MemoryType and MEMORY_TYPES.

Definition at line 28 of file memory_type.hpp.

◆ MEMORY_TYPES

constexpr std::array<MemoryType, 3> rapidsmpf::MEMORY_TYPES
constexpr
Initial value:
{
{MemoryType::DEVICE, MemoryType::PINNED_HOST, MemoryType::HOST}
}

All memory types sorted in decreasing order of preference.

Definition at line 23 of file memory_type.hpp.

◆ SPILL_TARGET_MEMORY_TYPES

constexpr std::array<MemoryType, 2> rapidsmpf::SPILL_TARGET_MEMORY_TYPES
constexpr
Initial value:
{
{MemoryType::PINNED_HOST, MemoryType::HOST}
}

Memory types that are valid spill destinations in decreasing order of preference.

This array defines the preferred targets for spilling when device memory is insufficient. The ordering reflects the policy of spilling in RapidsMPF, where earlier entries are considered more desirable spill destinations.

Definition at line 40 of file memory_type.hpp.