RAPIDS Multi-Processor interfaces. More...

Namespaces
	bootstrap

	coll
	Collective communication interfaces.

	communicator

	config

	detail

	mpi
	Collection of helpful MPI functions.

	rrun

	shuffler
	Shuffler interfaces.

	streaming

Classes
class	Tag
	A tag used for identifying messages in a communication operation. More...

class	Communicator
	Abstract base class for a communication mechanism between nodes. More...

class	MPI
	MPI communicator class that implements the `Communicator` interface. More...

class	Single
	Single process communicator class that implements the `Communicator` interface. More...

class	CudaEvent
	RAII wrapper for a CUDA event with convenience methods. More...

struct	cuda_error
	Exception thrown when a CUDA error is encountered. More...

class	bad_alloc
	Exception thrown when a RapidsMPF allocation fails. More...

class	out_of_memory
	Exception thrown when RapidsMPF runs out of memory. More...

class	reservation_error
	Exception thrown when a memory reservation fails in RapidsMPF. More...

struct	BloomFilter
	A bloom filter, used for approximate set membership queries. More...

class	Buffer
	Buffer representing device or host memory. More...

class	BufferResource
	Class managing buffer resources. More...

class	LimitAvailableMemory
	A functor for querying the remaining available memory within a defined limit from an RMM statistics resource. More...

class	ContentDescription
	Description of an object's content. More...

class	HostBuffer
	Block of host memory. More...

class	HostMemoryResource
	Host memory resource using standard CPU allocation. More...

class	MemoryReservation
	Represents a reservation for future memory allocation. More...

struct	PackedData
	Bag of bytes with metadata suitable for sending over the wire. More...

struct	PinnedPoolProperties
	Properties for configuring a pinned memory pool. More...

class	PinnedMemoryResource
	Memory resource that provides pinned (page-locked) host memory using a pool. More...

struct	ScopedMemoryRecord
	Memory statistics for a specific scope. More...

class	SpillManager
	Manages memory spilling to free up device memory when needed. More...

class	OwningWrapper
	Utility class to store an arbitrary type-erased object while another object is alive. More...

class	ProgressThread
	A progress thread that can execute arbitrary functions. More...

class	RmmResourceAdaptor
	An RMM memory resource adaptor tailored to RapidsMPF. More...

class	Statistics
	Tracks statistics across RapidsMPF operations. More...

class	StreamOrderedTiming
	Stream-ordered wall-clock timer that records its result into Statistics. More...

struct	overloaded
	Helper for overloaded lambdas using std::visit. More...

Typedefs
using	Rank = std::int32_t
	The rank of a node (e.g. the rank of an MPI process), or world size (total number of ranks). More...

using	OpID = std::int32_t
	Operation ID defined by the user. This allows users to concurrently execute multiple operations, and each operation will be identified by its OpID. More...

using	StageID = std::int32_t
	Identifier for a stage of a communication operation. More...

using	any_device_resource = cuda::mr::any_resource< cuda::mr::device_accessible >
	Owning type-erased device memory resource.

using	any_host_device_resource = cuda::mr::any_resource< cuda::mr::host_accessible, cuda::mr::device_accessible >
	Owning type-erased host- and device-accessible memory resource.

using	Clock = std::chrono::high_resolution_clock
	Alias for high-resolution clock from the chrono library.

using	Duration = std::chrono::duration< double >
	Alias for a duration type representing time in seconds as a double.

using	TimePoint = std::chrono::time_point< Clock, Duration >
	Alias for a time point with double precision in seconds.

Enumerations
enum class	AllowOverbooking : bool { NO , YES }
	Policy controlling whether a memory reservation is allowed to overbook. More...

enum class	MemoryType : int { DEVICE = 0 , PINNED_HOST = 1 , HOST = 2 }
	Enum representing the type of memory sorted in decreasing order of preference. More...

enum class	TrimZeroFraction { NO , YES }
	Control whether a zero fractional part is omitted when formatting values. More...

Functions
std::ostream &	operator<< (std::ostream &os, Communicator const &obj)
	Overloads the stream insertion operator for the Communicator class. More...

template<typename Range1 , typename Range2 >
void	cuda_stream_join (Range1 const &downstreams, Range2 const &upstreams, CudaEvent *event=nullptr)
	Make downstream CUDA streams wait on upstream CUDA streams. More...

void	cuda_stream_join (rmm::cuda_stream_view downstream, rmm::cuda_stream_view upstream, CudaEvent *event=nullptr)
	Make a downstream CUDA stream wait on an upstream CUDA stream. More...

std::pair< std::vector< cudf::table_view >, std::unique_ptr< cudf::table > >	partition_and_split (cudf::table_view const &table, std::vector< cudf::size_type > const &columns_to_hash, int num_partitions, cudf::hash_id hash_function, std::uint32_t seed, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
	Partitions rows from the input table into multiple output tables. More...

std::unordered_map< shuffler::PartID, PackedData >	partition_and_pack (cudf::table_view const &table, std::vector< cudf::size_type > const &columns_to_hash, int num_partitions, cudf::hash_id hash_function, std::uint32_t seed, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
	Partitions rows from the input table into multiple packed (serialized) tables. More...

std::unordered_map< shuffler::PartID, PackedData >	split_and_pack (cudf::table_view const &table, std::vector< cudf::size_type > const &splits, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
	Splits rows from the input table into multiple packed (serialized) tables. More...

std::unique_ptr< cudf::table >	unpack_and_concat (std::vector< PackedData > &&partitions, rmm::cuda_stream_view stream, BufferResource *br, AllowOverbooking allow_overbooking=AllowOverbooking::YES)
	Unpack (deserialize) input partitions and concatenate them into a single table. More...

std::vector< PackedData >	spill_partitions (std::vector< PackedData > &&partitions, BufferResource *br)
	Spill partitions from device memory to host memory. More...

std::vector< PackedData >	unspill_partitions (std::vector< PackedData > &&partitions, BufferResource *br, AllowOverbooking allow_overbooking)
	Move spilled partitions (i.e., packed tables in host memory) back to device memory. More...

std::string	str (cudf::column_view col, cudf::size_type index, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Converts the element at a specific index in a `cudf::column_view` to a string. More...

std::string	str (cudf::column_view col, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Converts all elements in a `cudf::column_view` to a string. More...

std::string	str (cudf::table_view tbl, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
	Converts all rows in a `cudf::table_view` to a string. More...

std::size_t	estimated_memory_usage (cudf::column_view const &col, rmm::cuda_stream_view stream)
	Estimate the memory usage of a column. More...

std::size_t	estimated_memory_usage (cudf::table_view const &tbl, rmm::cuda_stream_view stream)
	Estimate the memory usage of a table. More...

void	buffer_copy (std::shared_ptr< Statistics > statistics, Buffer &dst, Buffer const &src, std::size_t size, std::ptrdiff_t dst_offset=0, std::ptrdiff_t src_offset=0)
	Asynchronously copy data between buffers. More...

std::unordered_map< MemoryType, BufferResource::MemoryAvailable >	memory_available_from_options (RmmResourceAdaptor mr, config::Options options)
	Construct a map of memory-available functions from configuration options. More...

std::optional< Duration >	periodic_spill_check_from_options (config::Options options)
	Get the `periodic_spill_check` parameter from configuration options. More...

std::shared_ptr< rmm::cuda_stream_pool >	stream_pool_from_options (config::Options options)
	Get a new CUDA stream pool from configuration options. More...

cudaError_t	cuda_memcpy_async (void *dst, void const *src, std::size_t count, rmm::cuda_stream_view stream)
	Asynchronously copies memory between host and/or device buffers. More...

constexpr std::span< MemoryType const >	leq_memory_types (MemoryType mem_type) noexcept
	Get the memory types with preference lower than or equal to `mem_type`. More...

constexpr char const *	to_string (MemoryType mem_type)
	Get the name of a MemoryType. More...

std::ostream &	operator<< (std::ostream &os, MemoryType mem_type)
	Overload to write the name of a MemoryType to an output stream. More...

std::istream &	operator>> (std::istream &is, MemoryType &out)
	Overload to read a MemoryType value from an input stream. More...

bool	is_pinned_memory_resources_supported ()
	Checks if the PinnedMemoryResource is supported for the current CUDA version. More...

std::uint64_t	get_total_host_memory () noexcept
	Get the total amount of system memory. More...

int	get_current_numa_node () noexcept
	Get the NUMA node ID associated with the calling CPU thread. More...

std::vector< int >	get_current_numa_nodes () noexcept
	Get current NUMA node(s) for memory binding. More...

std::uint64_t	get_numa_node_host_memory (int numa_id=get_current_numa_node()) noexcept
	Get the total amount of host memory for a NUMA node. More...

std::uint64_t	get_host_memory_per_gpu ()
	Get the amount of host memory per GPU. More...

template<typename MapType >
std::pair< typename MapType::key_type, typename MapType::mapped_type >	extract_item (MapType &map, typename MapType::const_iterator position)
	Extracts a key-value pair from a map, removing it from the map. More...

template<typename MapType >
std::pair< typename MapType::key_type, typename MapType::mapped_type >	extract_item (MapType &map, typename MapType::key_type const &key)
	Extracts a key-value pair from a map, removing it from the map. More...

template<typename MapType >
MapType::mapped_type	extract_value (MapType &map, typename MapType::key_type const &key)
	Extracts the value associated with a specific key from a map, removing the key-value pair. More...

template<typename MapType >
MapType::mapped_type	extract_value (MapType &map, typename MapType::const_iterator position)
	Extracts the value associated with a specific key from a map, removing the key-value pair. More...

template<typename MapType >
MapType::key_type	extract_key (MapType &map, typename MapType::key_type const &key)
	Extracts a key from a map, removing the key-value pair. More...

template<typename MapType >
MapType::key_type	extract_key (MapType &map, typename MapType::const_iterator position)
	Extracts a key from a map, removing the key-value pair. More...

template<typename MapType >
auto	to_vector (MapType &&map)
	Converts a map-like associative container to a vector by moving the values and discarding the keys. More...

bool	is_running_under_valgrind ()
	Checks whether the application is running under Valgrind. More...

template<typename T >
constexpr T	safe_div (T x, T y)
	Performs safe division, returning 0 if the denominator is zero. More...

template<std::ranges::input_range R, typename T , typename Proj = std::identity>
constexpr bool	contains (R &&range, T const &value, Proj proj={})
	Backport of `std::ranges::contains` from C++23 for C++20. More...

template<typename To , typename From >
requires std::is_arithmetic_v< To > && std::is_arithmetic_v< From >
constexpr To	safe_cast (From value, std::source_location const &loc=std::source_location::current())
	Safely casts a numeric value to another type with overflow checking. More...

std::string	trim (std::string_view text)
	Trims whitespace from both ends of the specified string. More...

std::string	to_lower (std::string_view text)
	Converts the specified string to lowercase. More...

std::string	to_upper (std::string_view text)
	Converts the specified string to uppercase. More...

std::string	format_nbytes (double nbytes, int num_decimals=2, TrimZeroFraction trim_zero_fraction=TrimZeroFraction::YES)
	Format a byte count as a human-readable string using IEC units. More...

std::string	format_duration (double seconds, int precision=2, TrimZeroFraction trim_zero_fraction=TrimZeroFraction::YES)
	Format a time duration as a human-readable string. More...

std::int64_t	parse_nbytes (std::string_view text)
	Parse a human-readable byte count into an integer number of bytes. More...

std::size_t	parse_nbytes_unsigned (std::string_view text)
	Parse a human-readable byte count into a non-negative number of bytes. More...

std::size_t	parse_nbytes_or_percent (std::string_view text, double total_bytes)
	Parse a byte quantity or percentage into an absolute byte count. More...

Duration	parse_duration (std::string_view text)
	Parse a human-readable time duration into seconds. More...

template<typename T >
T	parse_string (std::string const &text)
	Parse a string into a value of type `T`. More...

template<>
bool	parse_string (std::string const &text)
	Specialization of `parse_string` for boolean values. More...

std::optional< std::string >	parse_optional (std::string text)
	Parse an optional string value. More...

std::vector< std::string >	parse_string_list (std::string_view text, char delimiter=',')
	Parse a delimited string into a list of trimmed substrings. More...

Variables
constexpr bool	COMM_HAVE_UCXX = false
	Whether RapidsMPF was built with the UCXX Communicator.

constexpr bool	COMM_HAVE_MPI = false
	Whether RapidsMPF was built with the MPI Communicator.

constexpr std::array< MemoryType, 3 >	MEMORY_TYPES
	All memory types sorted in decreasing order of preference. More...

constexpr std::array< char const *, MEMORY_TYPES.size()>	MEMORY_TYPE_NAMES
	Memory type names sorted to match `MemoryType` and `MEMORY_TYPES`. More...

constexpr std::array< MemoryType, 2 >	SPILL_TARGET_MEMORY_TYPES
	Memory types that are valid spill destinations in decreasing order of preference. More...

Detailed Description

RAPIDS Multi-Processor interfaces.

Typedef Documentation

◆ OpID

rapidsmpf::OpID

Operation ID defined by the user. This allows users to concurrently execute multiple operations, and each operation will be identified by its OpID.

Note: Although typed as an int32, the number of distinct operations is limited to 2^20.

Definition at line 44 of file communicator.hpp.

◆ Rank

rapidsmpf::Rank

The rank of a node (e.g. the rank of an MPI process), or world size (total number of ranks).

Note: Ranks are always consecutive integers starting at zero and ending at one less than the total number of ranks.

Definition at line 34 of file communicator.hpp.

◆ StageID

rapidsmpf::StageID

Identifier for a stage of a communication operation.

Note: Although typed as an int32, the number of distinct stages is limited to 2^3.

Definition at line 53 of file communicator.hpp.

Enumeration Type Documentation

◆ AllowOverbooking

enum rapidsmpf::AllowOverbooking : bool

strong

Policy controlling whether a memory reservation is allowed to overbook.

This enum is used throughout RapidsMPF to specify the overbooking behavior of a memory reservation request. The exact semantics depend on the specific API and execution context in which it is used.

Enumerator
NO	Overbooking is not allowed.
YES	Overbooking is allowed.

Definition at line 40 of file buffer_resource.hpp.

◆ MemoryType

enum rapidsmpf::MemoryType : int

strong

Enum representing the type of memory sorted in decreasing order of preference.

Enumerator
DEVICE	Device memory.
PINNED_HOST	Pinned host memory.
HOST	Host memory.

Definition at line 16 of file memory_type.hpp.

◆ TrimZeroFraction

enum rapidsmpf::TrimZeroFraction

strong

Control whether a zero fractional part is omitted when formatting values.

Enumerator
NO	Always keep the fractional part.
YES	Omit the fractional part when it consists only of zeros.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

Definition at line 43 of file string.hpp.

Function Documentation

◆ buffer_copy()

void rapidsmpf::buffer_copy	(	std::shared_ptr< Statistics >	statistics,
		Buffer &	dst,
		Buffer const &	src,
		std::size_t	size,
		std::ptrdiff_t	dst_offset = `0`,
		std::ptrdiff_t	src_offset = `0`
	)

Asynchronously copy data between buffers.

Copies size bytes from src, starting at src_offset, into dst at dst_offset.

Parameters

statistics	Statistics object used to record the copy operation. Use `Statistics::disabled()` to skip recording.
dst	Destination buffer.
src	Source buffer.
size	Number of bytes to copy.
dst_offset	Byte offset into the destination buffer.
src_offset	Byte offset into the source buffer.

Exceptions

std::invalid_argument If the requested range is out of bounds.

◆ contains()

template<std::ranges::input_range R, typename T , typename Proj = std::identity>

constexpr bool rapidsmpf::contains	(	R &&	range,
		T const &	value,
		Proj	proj = `{}`
	)

constexpr

Backport of std::ranges::contains from C++23 for C++20.

Checks whether a range contains a given value.

Template Parameters

R	An input range type.
T	The type of the value to search for.
Proj	A projection function applied to each element before comparison.

Parameters

range	The range to search.
value	The value to search for in the range.
proj	The projection to apply to each element before comparison.

Returns: true if any element in the range compares equal to value after projection, false otherwise.

Definition at line 281 of file misc.hpp.

◆ cuda_memcpy_async()

cudaError_t rapidsmpf::cuda_memcpy_async	(	void *	dst,
		void const *	src,
		std::size_t	count,
		rmm::cuda_stream_view	stream
	)

inline

Asynchronously copies memory between host and/or device buffers.

The copy direction is inferred from the pointer types (cudaMemcpyDefault). The source buffer must remain valid until the stream executes the copy.

This function should be used instead of cudaMemcpyAsync, as it provides improved semantics for asynchronous copies, especially from pageable host memory.

Background

The legacy cudaMemcpyAsync API accesses non-CUDA-registered host pointers (e.g., allocations from malloc or new) at the time of the API call, rather than in stream order. This behavior originates from earlier GPU architectures that could not directly access such memory, requiring an immediate CPU-side staging step.

Modern systems with HMM/ATS allow GPUs to access these pointers directly. However, the semantics of cudaMemcpyAsync cannot be changed without breaking existing code. The batched memcpy APIs (e.g., cudaMemcpyBatchAsync) introduced in CUDA 13.0 allow the caller to specify cudaMemcpySrcAccessOrderStream, ensuring that the source is accessed in stream order and enabling true asynchronous copies from pageable host memory.

Parameters

dst	Destination memory address.
src	Source memory address.
count	Number of bytes to copy.
stream	CUDA stream on which the copy is enqueued.

Returns: cudaError_t CUDA error code.

Definition at line 44 of file cuda_memcpy_async.hpp.

◆ cuda_stream_join() [1/2]

template<typename Range1 , typename Range2 >

void rapidsmpf::cuda_stream_join	(	Range1 const &	downstreams,
		Range2 const &	upstreams,
		CudaEvent *	event = `nullptr`
	)

Make downstream CUDA streams wait on upstream CUDA streams.

This call is asynchronous with respect to the host thread; no host-side blocking occurs.

Template Parameters

Range1	Iterable whose elements are rmm::cuda_stream_view.
Range2	Iterable whose elements are rmm::cuda_stream_view.

Parameters

downstreams	Streams that must not run ahead.
upstreams	Streams whose already-enqueued work must complete first.
event	Optional CUDA event used for synchronization. A unique event per call is not required; the same event may be reused. If `nullptr`, a temporary event is created internally. The reason to provide an event is to avoid the small overhead of constructing a temporary one.

Note: If all upstream and downstream streams are identical, this function is a no-op.

Definition at line 34 of file cuda_stream.hpp.

◆ cuda_stream_join() [2/2]

void rapidsmpf::cuda_stream_join	(	rmm::cuda_stream_view	downstream,
		rmm::cuda_stream_view	upstream,
		CudaEvent *	event = `nullptr`
	)

inline

Make a downstream CUDA stream wait on an upstream CUDA stream.

This call is asynchronous with respect to the host thread; no host-side blocking occurs.

Equivalent to calling the range overload with one upstream and one downstream.

Parameters

downstream	Stream that must not run ahead.
upstream	Stream whose already-enqueued work must complete first.
event	Optional CUDA event used for synchronization. A unique event per call is not required; the same event may be reused. If `nullptr`, a temporary event is created internally. The reason to provide an event is to avoid the small overhead of constructing a temporary one.

Note: If downstream and upstream are identical, this function is a no-op.

See also: cuda_stream_join(Range1 const&, Range2 const&, CudaEvent*)

Definition at line 91 of file cuda_stream.hpp.

◆ estimated_memory_usage() [1/2]

std::size_t rapidsmpf::estimated_memory_usage	(	cudf::column_view const &	col,
		rmm::cuda_stream_view	stream
	)

Estimate the memory usage of a column.

Parameters

col	The column to estimate the memory usage of.
stream	CUDA stream used for device memory operations and kernel launches.

Returns: The estimated memory usage of the column.

◆ estimated_memory_usage() [2/2]

std::size_t rapidsmpf::estimated_memory_usage	(	cudf::table_view const &	tbl,
		rmm::cuda_stream_view	stream
	)

Estimate the memory usage of a table.

Parameters

tbl	The table to estimate the memory usage of.
stream	CUDA stream used for device memory operations and kernel launches.

Returns: The estimated memory usage of the table.

◆ extract_item() [1/2]

template<typename MapType >

std::pair<typename MapType::key_type, typename MapType::mapped_type> rapidsmpf::extract_item	(	MapType &	map,
		typename MapType::const_iterator	position
	)

Extracts a key-value pair from a map, removing it from the map.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the key-value pair.
position	Const iterator pointing to a node in the map.

Returns: A pair containing the extracted key and value.

Note: Invalidates any iterators to the extracted element (notably position).

Exceptions

std::out_of_range If the iterator is not found in the map.

Definition at line 50 of file misc.hpp.

◆ extract_item() [2/2]

template<typename MapType >

std::pair<typename MapType::key_type, typename MapType::mapped_type> rapidsmpf::extract_item	(	MapType &	map,
		typename MapType::key_type const &	key
	)

Extracts a key-value pair from a map, removing it from the map.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the key-value pair.
key	The key to extract.

Returns: A pair containing the extracted key and value.

Exceptions

std::out_of_range If the key is not found in the map.

Definition at line 71 of file misc.hpp.
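
As a rough illustration of the extract-and-remove semantics, the sketch below rebuilds the key-based overload on top of the standard node-handle API (`extract()` on the container). This is an assumption about one possible implementation, not the code in misc.hpp; the `sketch` namespace and `example()` are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

namespace sketch {  // hypothetical namespace, not part of RapidsMPF

// Remove `key` from `map` and return the key and value by move.
template <typename MapType>
std::pair<typename MapType::key_type, typename MapType::mapped_type> extract_item(
    MapType& map, typename MapType::key_type const& key
) {
    auto node = map.extract(key);  // empty node handle if the key is absent
    if (node.empty()) {
        throw std::out_of_range("key not found in map");
    }
    return {std::move(node.key()), std::move(node.mapped())};
}

// Value-only variant built on top of extract_item.
template <typename MapType>
typename MapType::mapped_type extract_value(
    MapType& map, typename MapType::key_type const& key
) {
    return extract_item(map, key).second;
}

inline void example() {
    std::unordered_map<int, std::string> map{{1, "one"}, {2, "two"}};

    auto item = extract_item(map, 1);
    assert(item.first == 1);
    assert(item.second == "one");
    assert(map.size() == 1);  // the pair is no longer in the map

    assert(extract_value(map, 2) == "two");
    assert(map.empty());
}

}  // namespace sketch
```

Using a node handle moves the key and value out of the container without copying them, which matters for move-only mapped types.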

◆ extract_key() [1/2]

template<typename MapType >

MapType::key_type rapidsmpf::extract_key	(	MapType &	map,
		typename MapType::const_iterator	position
	)

Extracts a key from a map, removing the key-value pair.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the key.
position	Const iterator pointing to a node in the map.

Returns: The extracted key.

Note: Invalidates any iterators to the extracted element (notably position).

Exceptions

std::out_of_range If the iterator is not found in the map.

Definition at line 149 of file misc.hpp.

◆ extract_key() [2/2]

template<typename MapType >

MapType::key_type rapidsmpf::extract_key	(	MapType &	map,
		typename MapType::key_type const &	key
	)

Extracts a key from a map, removing the key-value pair.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the key.
key	The key to extract.

Returns: The extracted key.

Exceptions

std::out_of_range If the key is not found in the map.

Definition at line 130 of file misc.hpp.

◆ extract_value() [1/2]

template<typename MapType >

MapType::mapped_type rapidsmpf::extract_value	(	MapType &	map,
		typename MapType::const_iterator	position
	)

Extracts the value associated with a specific key from a map, removing the key-value pair.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the value.
position	Const iterator pointing to a node in the map.

Returns: The extracted value.

Note: Invalidates any iterators to the extracted element (notably position).

Exceptions

std::out_of_range If the iterator is not found in the map.

Definition at line 113 of file misc.hpp.

◆ extract_value() [2/2]

template<typename MapType >

MapType::mapped_type rapidsmpf::extract_value	(	MapType &	map,
		typename MapType::key_type const &	key
	)

Extracts the value associated with a specific key from a map, removing the key-value pair.

Template Parameters

MapType The type of the associative container.

Parameters

map	The map from which to extract the value.
key	The key associated with the value to extract.

Returns: The extracted value.

Exceptions

std::out_of_range If the key is not found in the map.

Definition at line 93 of file misc.hpp.

◆ format_duration()

std::string rapidsmpf::format_duration	(	double	seconds,
		int	precision = `2`,
		TrimZeroFraction	trim_zero_fraction = `TrimZeroFraction::YES`
	)

Format a time duration as a human-readable string.

Converts a duration given in seconds into a scaled string representation using common time units such as ns, us, ms, s, min, h, and d.

The duration is accepted as a double to support both fractional seconds and very large values without overflow.

Negative values are supported and are formatted with a leading minus sign, which is useful when representing signed time deltas.

Decimal formatting is controlled by precision. When trim_zero_fraction is set to TrimZeroFraction::YES, the fractional part is omitted entirely if all decimal digits are zero. Otherwise, the specified number of decimal places is preserved.

Parameters

seconds	Time duration to format, in seconds.
precision	Number of decimal places to include in the formatted value.
trim_zero_fraction	Whether to omit the fractional part when it consists only of zeros.

Returns: Human-readable string representation of the time duration.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ format_nbytes()

std::string rapidsmpf::format_nbytes	(	double	nbytes,
		int	num_decimals = `2`,
		TrimZeroFraction	trim_zero_fraction = `TrimZeroFraction::YES`
	)

Format a byte count as a human-readable string using IEC units.

Converts a byte count into a scaled string representation using binary (base-1024) units such as KiB, MiB, and GiB.

Negative values are supported and are formatted with a leading minus sign, which is useful when representing signed byte deltas or accounting values.

Decimal formatting is controlled by num_decimals. When trim_zero_fraction is set to TrimZeroFraction::YES, the fractional part is omitted entirely if all decimal digits are zero. Otherwise, the specified number of decimal places is preserved.

Examples:

1024 bytes with 2 decimals → "1.00 KiB" or "1 KiB" (trimmed)
1536 bytes with 2 decimals → "1.50 KiB"

Parameters

nbytes	Signed number of bytes to format, provided as a double to support any integer magnitude.
num_decimals	Number of decimal places to include in the formatted value.
trim_zero_fraction	Whether to omit the fractional part when it consists only of zeros.

Returns: Human-readable string representation of the byte count.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
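
The scaling and trimming rules can be sketched in a few lines of standard C++. The code below is an illustrative reimplementation under assumptions, not the library source: the `sketch` namespace, the list of units beyond GiB, and the use of `std::snprintf` are choices made here. It reproduces the two documented examples.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>

namespace sketch {  // hypothetical namespace, not part of RapidsMPF

enum class TrimZeroFraction { NO, YES };

inline std::string format_nbytes(
    double nbytes,
    int num_decimals = 2,
    TrimZeroFraction trim_zero_fraction = TrimZeroFraction::YES
) {
    // Assumed unit list; the documentation only names KiB, MiB, and GiB.
    constexpr std::array<char const*, 7> units{
        "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"
    };

    // Scale the magnitude down by 1024 until it fits the current unit.
    double value = std::abs(nbytes);
    std::size_t unit = 0;
    while (value >= 1024.0 && unit + 1 < units.size()) {
        value /= 1024.0;
        ++unit;
    }

    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%.*f", num_decimals, value);
    std::string digits(buffer);

    // Drop the fractional part only when every decimal digit is zero.
    if (trim_zero_fraction == TrimZeroFraction::YES) {
        auto const dot = digits.find('.');
        if (dot != std::string::npos
            && digits.find_first_not_of('0', dot + 1) == std::string::npos) {
            digits.erase(dot);
        }
    }
    return (nbytes < 0 ? "-" : "") + digits + " " + units[unit];
}

inline void example() {
    assert(format_nbytes(1024) == "1 KiB");
    assert(format_nbytes(1024, 2, TrimZeroFraction::NO) == "1.00 KiB");
    assert(format_nbytes(1536) == "1.50 KiB");
    assert(format_nbytes(-1536) == "-1.50 KiB");
}

}  // namespace sketch
```

Note that trimming is all-or-nothing in this sketch: "1.50" keeps its trailing zero because the fractional part is not entirely zero.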

◆ get_current_numa_node()

int rapidsmpf::get_current_numa_node ( )

noexcept

Get the NUMA node ID associated with the calling CPU thread.

A NUMA (Non-Uniform Memory Access) node represents a group of CPU cores and memory that have faster access to each other than to memory attached to other nodes. On NUMA systems, binding allocations and threads to the same NUMA node can significantly reduce memory access latency and improve bandwidth.

This function returns the NUMA node on which the calling thread is currently executing, as determined by the operating system's CPU and memory topology. The value can change if the thread migrates between CPUs.

If NUMA support is not available on the system or cannot be queried, the function returns 0, which corresponds to the single implicit NUMA node on non-NUMA systems.

Returns: The NUMA node ID of the calling thread, or 0 if NUMA is unavailable.

◆ get_current_numa_nodes()

std::vector<int> rapidsmpf::get_current_numa_nodes ( )

noexcept

Get current NUMA node(s) for memory binding.

Queries the NUMA node associated with the CPU on which the calling thread is currently executing. This is a best-effort approach and may not be accurate in all cases.

Since processes are typically scheduled on CPUs that are local to their memory, using the CPU's NUMA node (via numa_node_of_cpu) provides a reasonable approximation that works well in practice for topology-aware binding scenarios. This intentionally avoids querying the process memory binding policy programmatically.

If NUMA support is not available or the NUMA node cannot be determined, the function returns a vector containing a single element, 0, which corresponds to the single implicit NUMA node on non-NUMA systems.

Returns: Vector of NUMA node IDs associated with the calling thread.

◆ get_host_memory_per_gpu()

std::uint64_t rapidsmpf::get_host_memory_per_gpu ( )

Get the amount of host memory per GPU.

This is calculated as the total host memory available for the current NUMA node divided by the number of GPUs bound to that NUMA node.

Exceptions

std::runtime_error If no GPUs are found on the current NUMA node.

Returns: Amount of host memory per GPU in bytes.

◆ get_numa_node_host_memory()

std::uint64_t rapidsmpf::get_numa_node_host_memory ( int numa_id = get_current_numa_node() )

noexcept

Get the total amount of host memory for a NUMA node.

Parameters

numa_id NUMA node for which to query the total host memory. Defaults to the current NUMA node as returned by get_current_numa_node().

Note: If NUMA support is not available or the node size cannot be determined, this function falls back to returning the total host memory.

Returns: Total host memory of the NUMA node in bytes.

◆ get_total_host_memory()

std::uint64_t rapidsmpf::get_total_host_memory ( )

noexcept

Get the total amount of system memory.

Returns: Total host memory in bytes.

Note: On WSL and in containerized environments, the returned value reflects the memory visible to the Linux kernel instance, which may differ from the physical memory of the host.

Note: Terminates the process if sysconf(_SC_PAGE_SIZE) or sysconf(_SC_PHYS_PAGES) fails.
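
The two `sysconf` queries named in the note suggest that the total is the page size multiplied by the number of physical pages. The sketch below shows that calculation on Linux with glibc. It is an assumption about the approach, not the library code: `total_host_memory` is a hypothetical helper, and unlike the documented function it reports failure through an empty `std::optional` instead of terminating.

```cpp
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <optional>

namespace sketch {  // hypothetical namespace, not part of RapidsMPF

// Memory visible to the kernel in bytes, or std::nullopt if a query fails.
inline std::optional<std::uint64_t> total_host_memory() noexcept {
    long const page_size = sysconf(_SC_PAGE_SIZE);
    long const num_pages = sysconf(_SC_PHYS_PAGES);
    if (page_size <= 0 || num_pages <= 0) {
        return std::nullopt;
    }
    return static_cast<std::uint64_t>(page_size)
           * static_cast<std::uint64_t>(num_pages);
}

inline void example() {
    auto const total = total_host_memory();
    // On a typical Linux system both queries succeed.
    assert(total.has_value());
    assert(*total > 0);
}

}  // namespace sketch
```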

◆ is_pinned_memory_resources_supported()

bool rapidsmpf::is_pinned_memory_resources_supported ( )

inline

Checks if the PinnedMemoryResource is supported for the current CUDA version.

RapidsMPF requires CUDA 12.6 or newer to support pinned memory resources.

Returns: True if the PinnedMemoryResource is supported for the current CUDA version, false otherwise.

Definition at line 44 of file pinned_memory_resource.hpp.

◆ is_running_under_valgrind()

bool rapidsmpf::is_running_under_valgrind ( )

Checks whether the application is running under Valgrind.

Returns: true if the application is running under Valgrind, false otherwise.

◆ leq_memory_types()

constexpr std::span<MemoryType const> rapidsmpf::leq_memory_types ( MemoryType mem_type )

constexpr noexcept

Get the memory types with preference lower than or equal to mem_type.

The returned span reflects the predefined ordering used in MEMORY_TYPES, which lists memory types in decreasing order of preference.

Parameters

mem_type The memory type used as the starting point.

Returns: A span of memory types whose preference is lower than or equal to the given type.

Definition at line 54 of file memory_type.hpp.
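
The ordering rule can be illustrated with a standalone sketch. The enum below is a stand-in that copies the order shown for MEMORY_TYPES on this page. The function name leq_memory_types_sketch is hypothetical, and it returns a std::vector rather than a std::span so that it also compiles as C++17.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Stand-in for rapidsmpf::MemoryType; the enumerator order is an assumption
// taken from the MEMORY_TYPES initializer documented on this page.
enum class MemoryType { DEVICE, PINNED_HOST, HOST };

constexpr std::array<MemoryType, 3> MEMORY_TYPES{
    {MemoryType::DEVICE, MemoryType::PINNED_HOST, MemoryType::HOST}};

// Returns mem_type and every type that follows it in MEMORY_TYPES, i.e. all
// types whose preference is lower than or equal to mem_type.
std::vector<MemoryType> leq_memory_types_sketch(MemoryType mem_type) {
    std::size_t start = 0;
    while (start < MEMORY_TYPES.size() && MEMORY_TYPES[start] != mem_type) {
        ++start;
    }
    return std::vector<MemoryType>(MEMORY_TYPES.begin() + start, MEMORY_TYPES.end());
}
```

With this ordering, PINNED_HOST maps to {PINNED_HOST, HOST} and HOST maps to {HOST}.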

◆ memory_available_from_options()

std::unordered_map<MemoryType, BufferResource::MemoryAvailable> rapidsmpf::memory_available_from_options	(	RmmResourceAdaptor	mr,
		config::Options	options
	)

Construct a map of memory-available functions from configuration options.

Parameters

mr	The RMM resource adaptor.
options	Configuration options.

Returns: The map of memory-available functions.

◆ operator<<() [1/2]

std::ostream& rapidsmpf::operator<<	(	std::ostream &	os,
		Communicator const &	obj
	)

inline

Overloads the stream insertion operator for the Communicator class.

This function allows a description of a Communicator to be written to an output stream.

Parameters

os	The output stream to write to.
obj	The object to write.

Returns: A reference to the modified output stream.

Definition at line 653 of file communicator.hpp.

◆ operator<<() [2/2]

std::ostream& rapidsmpf::operator<<	(	std::ostream &	os,
		MemoryType	mem_type
	)

Overload to write type name to the output stream.

Parameters

os	The output stream.
mem_type	The memory type to write name of to the output stream.

Returns: The output stream.

◆ operator>>()

std::istream& rapidsmpf::operator>>	(	std::istream &	is,
		MemoryType &	out
	)

Overload to read a MemoryType value from an input stream.

Parsing is case-insensitive. Supported values are: "DEVICE", "PINNED_HOST", "PINNED", "PINNED-HOST", and "HOST".

If token extraction from the stream fails, the stream state is preserved. If extraction succeeds but the token does not represent a valid MemoryType, the stream failbit is set.

Parameters

is	The input stream.
out	The memory type read from the input stream.

Returns: The input stream.
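
The parsing rules above can be sketched as a standalone function. MemoryType here is a stand-in enum and read_memory_type_sketch is a hypothetical name; the sketch is written as a named function rather than an operator>> overload to avoid suggesting it is the library's code.

```cpp
#include <algorithm>
#include <cctype>
#include <istream>
#include <sstream>  // only needed for the usage shown below
#include <string>

enum class MemoryType { DEVICE, PINNED_HOST, HOST };  // stand-in enum

std::istream& read_memory_type_sketch(std::istream& is, MemoryType& out) {
    std::string token;
    if (!(is >> token)) {
        return is;  // extraction failed: leave the stream state as it is
    }
    // Case-insensitive matching: compare against uppercase names.
    std::transform(token.begin(), token.end(), token.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    if (token == "DEVICE") {
        out = MemoryType::DEVICE;
    } else if (token == "PINNED_HOST" || token == "PINNED" || token == "PINNED-HOST") {
        out = MemoryType::PINNED_HOST;
    } else if (token == "HOST") {
        out = MemoryType::HOST;
    } else {
        is.setstate(std::ios::failbit);  // token read, but not a valid MemoryType
    }
    return is;
}
```

Reading from std::istringstream("pinned-host") sets out to PINNED_HOST, while an unknown token such as "gpu" leaves out untouched and sets failbit.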

◆ parse_duration()

Duration rapidsmpf::parse_duration ( std::string_view text )

Parse a human-readable time duration into seconds.

Parses a numeric value followed by an optional time unit suffix and converts it to a Duration, which represents a time interval in seconds as a double.

Supported units:

Nanoseconds: ns
Microseconds: µs or us
Milliseconds: ms
Seconds: s
Minutes: m or min
Hours: h
Days: d

Units are case-insensitive. If no unit is provided, the value is interpreted as seconds.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e3", "2.5E-2"). Negative values are supported.

Parameters

text	Time duration string to parse.

Returns: Parsed duration in seconds.

Exceptions

std::invalid_argument	If the string format is invalid or the unit is not recognized.
std::out_of_range	If the parsed value is not finite.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
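
The unit table and error rules above can be sketched as follows. This is not the library's code: parse_duration_sketch is a hypothetical name, Duration is assumed to be std::chrono::duration<double>, std::strtod is used for the numeric part, and ignoring whitespace around the unit is an assumption.

```cpp
#include <cctype>
#include <chrono>
#include <cmath>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <string_view>

using Duration = std::chrono::duration<double>;  // assumption: seconds as a double

Duration parse_duration_sketch(std::string_view text) {
    std::string const s(text);  // std::strtod needs a NUL-terminated string
    char* end = nullptr;
    double const value = std::strtod(s.c_str(), &end);
    if (end == s.c_str()) throw std::invalid_argument("missing numeric value");
    if (!std::isfinite(value)) throw std::out_of_range("value is not finite");

    // Lowercase the remaining suffix and drop whitespace (assumption).
    std::string unit;
    for (char const* p = end; *p != '\0'; ++p) {
        auto const c = static_cast<unsigned char>(*p);
        if (!std::isspace(c)) unit += static_cast<char>(std::tolower(c));
    }

    double seconds_per_unit = 1.0;  // no unit means seconds
    if (unit.empty() || unit == "s") seconds_per_unit = 1.0;
    else if (unit == "ns") seconds_per_unit = 1e-9;
    else if (unit == "us" || unit == "\xC2\xB5s") seconds_per_unit = 1e-6;  // UTF-8 micro sign
    else if (unit == "ms") seconds_per_unit = 1e-3;
    else if (unit == "m" || unit == "min") seconds_per_unit = 60.0;
    else if (unit == "h") seconds_per_unit = 3600.0;
    else if (unit == "d") seconds_per_unit = 86400.0;
    else throw std::invalid_argument("unrecognized unit: " + unit);

    return Duration(value * seconds_per_unit);
}
```

Under these assumptions "2h" parses to 7200 seconds, "1.5min" to 90 seconds, and a bare "3" to 3 seconds.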

◆ parse_nbytes()

std::int64_t rapidsmpf::parse_nbytes ( std::string_view text )

Parse a human-readable byte count into an integer number of bytes.

Parses a numeric value followed by an optional unit suffix and converts it to a byte count. Both IEC (base-1024) and SI (base-1000) units are supported.

Supported units:

Bytes: B
IEC (base-1024): KiB, MiB, GiB, TiB, PiB, EiB, ZiB, YiB
SI (base-1000): KB, MB, GB, TB, PB, EB, ZB, YB

Units are case-insensitive. If no unit is provided, the value is interpreted as bytes.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e6", "2.5E-3"). The final byte count is rounded to the nearest integer, with ties rounded away from zero.

Parameters

text	Byte count string to parse.

Returns: Parsed byte count in bytes.

Exceptions

std::invalid_argument	If the string format is invalid or the unit is not recognized.
std::out_of_range	If the parsed value is not finite or the resulting byte count overflows a 64-bit signed integer.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
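
A standalone sketch of the unit handling and rounding described above. The name parse_nbytes_sketch is hypothetical, std::strtod handles the numeric part, and ignoring whitespace around the unit is an assumption. std::round provides the documented rounding (nearest integer, ties away from zero).

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <string_view>

std::int64_t parse_nbytes_sketch(std::string_view text) {
    std::string const s(text);  // std::strtod needs a NUL-terminated string
    char* end = nullptr;
    double const value = std::strtod(s.c_str(), &end);
    if (end == s.c_str()) throw std::invalid_argument("missing numeric value");
    if (!std::isfinite(value)) throw std::out_of_range("value is not finite");

    // Lowercase the remaining suffix and drop whitespace (assumption).
    std::string unit;
    for (char const* p = end; *p != '\0'; ++p) {
        auto const c = static_cast<unsigned char>(*p);
        if (!std::isspace(c)) unit += static_cast<char>(std::tolower(c));
    }

    double factor = 1.0;  // no unit or "B" means bytes
    if (!unit.empty() && unit != "b") {
        std::string_view const prefixes = "kmgtpezy";
        std::size_t const exponent = prefixes.find(unit[0]);
        bool const is_iec = unit.size() == 3 && unit.compare(1, 2, "ib") == 0;  // e.g. "kib"
        bool const is_si = unit.size() == 2 && unit[1] == 'b';                  // e.g. "kb"
        if (exponent == std::string_view::npos || !(is_iec || is_si)) {
            throw std::invalid_argument("unrecognized unit: " + unit);
        }
        double const base = is_iec ? 1024.0 : 1000.0;
        for (std::size_t i = 0; i <= exponent; ++i) factor *= base;
    }

    double const bytes = std::round(value * factor);  // ties round away from zero
    if (bytes >= 9223372036854775808.0 || bytes < -9223372036854775808.0) {
        throw std::out_of_range("byte count overflows a 64-bit signed integer");
    }
    return static_cast<std::int64_t>(bytes);
}
```

Under these assumptions "1KiB" gives 1024, "2gb" gives 2000000000, and "0.5" rounds up to 1.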

◆ parse_nbytes_or_percent()

std::size_t rapidsmpf::parse_nbytes_or_percent	(	std::string_view	text,
		double	total_bytes
	)

Parse a byte quantity or percentage into an absolute byte count.

The input may be a human-readable byte string (e.g. "1GiB", "512MB") or a percentage (e.g. "25%"). See parse_nbytes_unsigned for the exact parsing semantics of the numeric part.

If text ends with '%', the numeric part is first parsed using parse_nbytes_unsigned, then interpreted as a percentage of total_bytes.

Otherwise, text is parsed as an absolute byte value and returned as-is.

Parameters

text	Input string representing a byte quantity or percentage.
total_bytes	Total number of bytes used when `text` is a percentage. Must be positive.

Returns: Absolute number of bytes computed from text.

Exceptions

std::invalid_argument	If the input format is invalid, the value is negative, or if `total_bytes` is not positive.
std::out_of_range	If the parsed or computed value exceeds the representable range.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ parse_nbytes_unsigned()

std::size_t rapidsmpf::parse_nbytes_unsigned ( std::string_view text )

Parse a human-readable byte count into a non-negative number of bytes.

Parses a numeric value followed by an optional unit suffix and converts it to a byte count. Both IEC (base-1024) and SI (base-1000) units are supported.

Supported units:

Bytes: B
IEC (base-1024): KiB, MiB, GiB, TiB, PiB, EiB, ZiB, YiB
SI (base-1000): KB, MB, GB, TB, PB, EB, ZB, YB

Units are case-insensitive. If no unit is provided, the value is interpreted as bytes.

The numeric portion may be specified using integer, decimal, or scientific notation (e.g. "1e6", "2.5E-3"). The final byte count is rounded to the nearest integer, with ties rounded away from zero.

Negative values are not permitted.

Parameters

text	Byte count string to parse.

Returns: Parsed byte count in bytes.

Exceptions

std::invalid_argument	If the string format is invalid, the unit is not recognized, or the parsed value is negative.
std::out_of_range	If the parsed value is not finite or overflows std::size_t.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ parse_optional()

std::optional<std::string> rapidsmpf::parse_optional ( std::string text )

Parse an optional string value.

Returns std::nullopt if the input string represents a disabled value. Otherwise, the input string is returned unchanged.

Disabled values are matched case-insensitively and may include surrounding whitespace. Recognized values include: false, no, off, disable, disabled, none, n/a, and na.

Parameters

text	Input string to parse.

Returns: std::optional<std::string> Parsed optional string.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
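
The matching rule can be sketched as below. parse_optional_sketch is a hypothetical name, and the list of disabled values is copied from the description above; because that description says "include", the real list may be longer.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <optional>
#include <string>
#include <string_view>

std::optional<std::string> parse_optional_sketch(std::string text) {
    // Build a normalized key: surrounding whitespace removed, lowercased.
    auto const is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    auto const first = std::find_if_not(text.begin(), text.end(), is_space);
    auto const last = std::find_if_not(text.rbegin(), text.rend(), is_space).base();
    std::string key = first < last ? std::string(first, last) : std::string();
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    static constexpr std::array<std::string_view, 8> disabled{
        {"false", "no", "off", "disable", "disabled", "none", "n/a", "na"}};
    if (std::find(disabled.begin(), disabled.end(), key) != disabled.end()) {
        return std::nullopt;  // the input represents a disabled value
    }
    return text;  // any other input is returned unchanged, whitespace included
}
```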

◆ parse_string() [1/2]

template<typename T >

T rapidsmpf::parse_string ( std::string const & text )

Specialization of parse_string for boolean values.

Converts the input string to a boolean. This function handles common boolean representations such as true, false, on, off, yes, and no, as well as numeric representations (e.g., 0 or 1). The input is first checked for a numeric value using std::stoi; if that fails, it is lowercased and trimmed before matching against known textual representations.

Parameters

text	String to convert to a boolean.

Returns: The corresponding boolean value.

Exceptions

std::invalid_argument If the string cannot be interpreted as a boolean.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

Definition at line 242 of file string.hpp.

◆ parse_string() [2/2]

template<>

bool rapidsmpf::parse_string ( std::string const & text )

Specialization of parse_string for boolean values.

Converts the input string to a boolean. This function handles common boolean representations such as true, false, on, off, yes, and no, as well as numeric representations (e.g., 0 or 1). The input is first checked for a numeric value using std::stoi; if that fails, it is lowercased and trimmed before matching against known textual representations.

Parameters

text	String to convert to a boolean.

Returns: The corresponding boolean value.

Exceptions

std::invalid_argument If the string cannot be interpreted as a boolean.

Definition at line 242 of file string.hpp.
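
The two-step conversion described for the boolean specialization (numeric check first, then text matching) can be sketched as follows. parse_bool_sketch is a hypothetical name. Treating every nonzero number as true is an assumption, and std::stoi accepts trailing characters (for example "1abc"), which the real function may or may not reject.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>

bool parse_bool_sketch(std::string const& text) {
    // Step 1: numeric representations such as "0" or "1".
    try {
        return std::stoi(text) != 0;
    } catch (std::logic_error const&) {
        // std::invalid_argument or std::out_of_range: fall through to text matching.
    }

    // Step 2: trim and lowercase, then match the textual representations.
    auto const is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    auto const first = std::find_if_not(text.begin(), text.end(), is_space);
    auto const last = std::find_if_not(text.rbegin(), text.rend(), is_space).base();
    std::string key = first < last ? std::string(first, last) : std::string();
    std::transform(key.begin(), key.end(), key.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    if (key == "true" || key == "on" || key == "yes") return true;
    if (key == "false" || key == "off" || key == "no") return false;
    throw std::invalid_argument("cannot interpret \"" + text + "\" as a boolean");
}
```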

◆ parse_string_list()

std::vector<std::string> rapidsmpf::parse_string_list	(	std::string_view	text,
		char	delimiter = `','`
	)

Parse a delimited string into a list of trimmed substrings.

Splits the input string by the specified delimiter and returns a vector of trimmed tokens. Leading and trailing whitespace is removed from each token.

If the input string is empty or contains only whitespace, an empty vector is returned.

Parameters

text	Input string to parse.
delimiter	Character to use as the delimiter. Defaults to comma.

Returns: Vector of trimmed strings.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.
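
A standalone sketch of the split-and-trim behavior. The names parse_string_list_sketch and trim_view are hypothetical. Keeping empty tokens (as in "a,,b") is an assumption, since the description above does not say how they are handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: strips leading and trailing whitespace from a view.
std::string_view trim_view(std::string_view s) {
    auto const is_space = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    while (!s.empty() && is_space(s.front())) s.remove_prefix(1);
    while (!s.empty() && is_space(s.back())) s.remove_suffix(1);
    return s;
}

std::vector<std::string> parse_string_list_sketch(std::string_view text, char delimiter = ',') {
    std::vector<std::string> out;
    if (trim_view(text).empty()) return out;  // empty or whitespace-only input

    std::size_t start = 0;
    while (true) {
        std::size_t const pos = text.find(delimiter, start);
        std::size_t const len = pos == std::string_view::npos ? pos : pos - start;
        out.emplace_back(trim_view(text.substr(start, len)));
        if (pos == std::string_view::npos) break;
        start = pos + 1;  // continue after the delimiter
    }
    return out;
}
```

With this sketch, " a, b ,c " yields the three tokens "a", "b", and "c".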

◆ partition_and_pack()

std::unordered_map<shuffler::PartID, PackedData> rapidsmpf::partition_and_pack	(	cudf::table_view const &	table,
		std::vector< cudf::size_type > const &	columns_to_hash,
		int	num_partitions,
		cudf::hash_id	hash_function,
		std::uint32_t	seed,
		rmm::cuda_stream_view	stream,
		BufferResource *	br,
		AllowOverbooking	allow_overbooking = `AllowOverbooking::YES`
	)

Partitions rows from the input table into multiple packed (serialized) tables.

Parameters

table	The table to partition.
columns_to_hash	Indices of input columns to hash.
num_partitions	The number of partitions to use.
hash_function	Hash function to use.
seed	Seed value to the hash function.
stream	CUDA stream used for device memory operations and kernel launches.
br	Buffer resource for memory allocations.
allow_overbooking	If true, allow overbooking (true by default). TODO: disable this by default, see https://github.com/rapidsmpf/rapidsmpf/issues/449

Returns: A map of partition IDs and their packed tables.

Exceptions

std::out_of_range if an index in columns_to_hash is invalid.

See also: unpack_and_concat; cudf::hash_partition; cudf::pack

◆ partition_and_split()

std::pair<std::vector<cudf::table_view>, std::unique_ptr<cudf::table> > rapidsmpf::partition_and_split	(	cudf::table_view const &	table,
		std::vector< cudf::size_type > const &	columns_to_hash,
		int	num_partitions,
		cudf::hash_id	hash_function,
		std::uint32_t	seed,
		rmm::cuda_stream_view	stream,
		BufferResource *	br,
		AllowOverbooking	allow_overbooking = `AllowOverbooking::YES`
	)

Partitions rows from the input table into multiple output tables.

Parameters

table	The table to partition.
columns_to_hash	Indices of input columns to hash.
num_partitions	The number of partitions.
hash_function	Hash function to use.
seed	Seed value to the hash function.
stream	CUDA stream used for device memory operations and kernel launches.
br	Buffer resource for memory allocations.
allow_overbooking	If true, allow overbooking (true by default)

Returns: A vector of each partition and a table that owns the device memory.

Exceptions

std::out_of_range if an index in columns_to_hash is invalid.

See also: cudf::hash_partition; cudf::split

◆ periodic_spill_check_from_options()

std::optional<Duration> rapidsmpf::periodic_spill_check_from_options ( config::Options options )

Get the periodic_spill_check parameter from configuration options.

Parameters

options Configuration options.

Returns: The duration of the pause between spill checks or std::nullopt if no dedicated thread should check for spilling.

◆ safe_cast()

template<typename To , typename From >

requires std::is_arithmetic_v<To> && std::is_arithmetic_v<From> constexpr To rapidsmpf::safe_cast	(	From	value,
		std::source_location const &	loc = `std::source_location::current()`
	)

constexpr

Safely casts a numeric value to another type with overflow checking.

For integral conversions, the value must be representable in the destination type or an exception is thrown.

For conversions involving floating point types, overflow and underflow follow standard floating point semantics. The result may become inf or -inf, or lose precision, without throwing.

Template Parameters

To	The destination type.
From	The source type.

Parameters

value	The value to cast.
loc	Source location (automatically captured).

Returns: To The safely cast value.

Exceptions

std::overflow_error if an integral value cannot be represented in the destination type.

Definition at line 311 of file misc.hpp.
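
The integral case can be illustrated with a round-trip check. This is a sketch of one common technique, not necessarily how safe_cast is written. The names checked_int_cast_sketch and is_negative_sketch are hypothetical. The std::source_location parameter is omitted and floating point conversions are not covered, so the sketch also compiles as C++17.

```cpp
#include <cstdint>
#include <stdexcept>
#include <type_traits>

template <typename T>
constexpr bool is_negative_sketch(T v) {
    if constexpr (std::is_signed_v<T>) {
        return v < T{0};
    } else {
        return false;  // unsigned values are never negative
    }
}

// Integral-only sketch: the value must survive a round trip through the
// destination type with its sign intact, otherwise it is not representable.
template <typename To, typename From>
constexpr To checked_int_cast_sketch(From value) {
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>);
    To const result = static_cast<To>(value);
    if (static_cast<From>(result) != value ||
        is_negative_sketch(result) != is_negative_sketch(value)) {
        throw std::overflow_error("value is not representable in the destination type");
    }
    return result;
}
```

For instance, casting 200 to std::uint8_t succeeds, while casting 300 to std::uint8_t or -1 to unsigned throws std::overflow_error.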

◆ safe_div()

template<typename T >

constexpr T rapidsmpf::safe_div	(	T	x,
		T	y
	)

constexpr

Performs safe division, returning 0 if the denominator is zero.

Template Parameters

T	The numeric type of the operands.

Parameters

x	The numerator.
y	The denominator.

Returns: T The result of x / y, or 0 if y is zero.

Definition at line 192 of file misc.hpp.
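
The documented behavior fits in one expression. The following sketch uses the hypothetical name safe_div_sketch and assumes T supports comparison with zero and division.

```cpp
// Returns x / y, or zero when the denominator is zero.
template <typename T>
constexpr T safe_div_sketch(T x, T y) {
    return y == T{0} ? T{0} : x / y;
}

// Usable in constant expressions because the function is constexpr.
static_assert(safe_div_sketch(10, 0) == 0);
static_assert(safe_div_sketch(10, 2) == 5);
```

This form is convenient when computing averages or ratios from counters that may still be zero.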

◆ spill_partitions()

std::vector<PackedData> rapidsmpf::spill_partitions	(	std::vector< PackedData > &&	partitions,
		BufferResource *	br
	)

Spill partitions from device memory to host memory.

Moves the buffer of each PackedData from device memory to host memory using the provided buffer resource and the buffer's CUDA stream. Partitions that are already in host memory are passed through unchanged.

For device-resident partitions, a host memory reservation is made before moving the buffer. If the reservation fails due to insufficient host memory, an exception is thrown. Overbooking is not allowed.

Parameters

partitions	The partitions to spill.
br	Buffer resource used to reserve host memory and perform the move.

Returns: A vector of PackedData, where each buffer resides in host memory.

Exceptions

rapidsmpf::reservation_error If host memory reservation fails.

◆ split_and_pack()

std::unordered_map<shuffler::PartID, PackedData> rapidsmpf::split_and_pack	(	cudf::table_view const &	table,
		std::vector< cudf::size_type > const &	splits,
		rmm::cuda_stream_view	stream,
		BufferResource *	br,
		AllowOverbooking	allow_overbooking = `AllowOverbooking::YES`
	)

Splits rows from the input table into multiple packed (serialized) tables.

Parameters

table	The table to split and pack into partitions.
splits	The split points, equivalent to cudf::split(), i.e. one less than the number of result partitions.
stream	CUDA stream used for device memory operations and kernel launches.
br	Buffer resource for memory allocations.
allow_overbooking	If true, allow overbooking (true by default). TODO: disable this by default, see https://github.com/rapidsmpf/rapidsmpf/issues/449

Returns: A map of partition IDs and their packed tables.

Exceptions

std::out_of_range if the splits are invalid.

See also: unpack_and_concat; cudf::split; partition_and_pack

◆ str() [1/3]

std::string rapidsmpf::str	(	cudf::column_view	col,
		cudf::size_type	index,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Converts the element at a specific index in a cudf::column_view to a string.

Parameters

col	The column view containing the data.
index	The index of the element to convert.
stream	CUDA stream used for device memory operations and kernel launches.
mr	Memory resource for device memory allocation.

Returns: A string representation of the element at the specified index.

◆ str() [2/3]

std::string rapidsmpf::str	(	cudf::column_view	col,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Converts all elements in a cudf::column_view to a string.

Parameters

col	The column view containing the data.
stream	CUDA stream used for device memory operations and kernel launches.
mr	Memory resource for device memory allocation.

Returns: A string representation of all elements in the column.

◆ str() [3/3]

std::string rapidsmpf::str	(	cudf::table_view	tbl,
		rmm::cuda_stream_view	stream = `cudf::get_default_stream()`,
		rmm::device_async_resource_ref	mr = `cudf::get_current_device_resource_ref()`
	)

Converts all rows in a cudf::table_view to a string.

Parameters

tbl	The table view containing the data.
stream	CUDA stream used for device memory operations and kernel launches.
mr	Memory resource for device memory allocation.

Returns: A string representation of all rows in the table.

◆ stream_pool_from_options()

std::shared_ptr<rmm::cuda_stream_pool> rapidsmpf::stream_pool_from_options ( config::Options options )

Get a new CUDA stream pool from configuration options.

Parameters

options Configuration options.

Returns: Pool of CUDA streams used throughout RapidsMPF for operations that do not take an explicit CUDA stream.

◆ to_lower()

std::string rapidsmpf::to_lower ( std::string_view text )

Converts the specified string to lowercase.

Parameters

text	The input string to be processed.

Returns: The lowercased string.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ to_string()

constexpr char const* rapidsmpf::to_string ( MemoryType mem_type )

constexpr

Get the name of a MemoryType.

Parameters

mem_type The memory type.

Returns: The memory type name.

Definition at line 75 of file memory_type.hpp.

◆ to_upper()

std::string rapidsmpf::to_upper ( std::string_view text )

Converts the specified string to uppercase.

Parameters

text	The input string to be processed.

Returns: The uppercased string.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ to_vector()

template<typename MapType >

auto rapidsmpf::to_vector ( MapType && map )

Converts a map-like associative container to a vector by moving the values and discarding the keys.

Template Parameters

MapType The type of the map-like associative container. Must provide a mapped_type and support range-based for-loops.

Parameters

map	The map whose values will be moved into the resulting vector. Keys are ignored.

Returns: A std::vector containing the moved values from the input map.

Definition at line 166 of file misc.hpp.
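
A minimal sketch of the conversion, relying only on the two requirements listed above (a mapped_type member and range-based iteration). The names to_vector_sketch and to_vector_usage_sketch are hypothetical. The sketch moves the values out even when given an lvalue, so the caller should treat the map as consumed.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

template <typename MapType>
auto to_vector_sketch(MapType&& map) {
    using Map = std::remove_reference_t<MapType>;
    std::vector<typename Map::mapped_type> out;
    for (auto& kv : map) {
        out.push_back(std::move(kv.second));  // keys are discarded
    }
    return out;
}

// Usage: an ordered std::map is chosen here so the output order is deterministic.
std::vector<std::string> to_vector_usage_sketch() {
    std::map<int, std::string> parts{{1, "a"}, {2, "b"}};
    return to_vector_sketch(std::move(parts));
}
```

With an unordered container the values appear in the container's iteration order, which is unspecified.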

◆ trim()

std::string rapidsmpf::trim ( std::string_view text )

Trims whitespace from both ends of the specified string.

Parameters

text	The input string to be processed.

Returns: The trimmed string.

Examples: /__w/rapidsmpf/rapidsmpf/cpp/include/rapidsmpf/utils/string.hpp.

◆ unpack_and_concat()

std::unique_ptr<cudf::table> rapidsmpf::unpack_and_concat	(	std::vector< PackedData > &&	partitions,
		rmm::cuda_stream_view	stream,
		BufferResource *	br,
		AllowOverbooking	allow_overbooking = `AllowOverbooking::YES`
	)

Unpack (deserialize) input partitions and concatenate them into a single table.

Empty partitions are ignored.

The unpacking of each partition is stream-ordered on that partition's own CUDA stream. The returned table is stream-ordered on the provided stream and synchronized with the unpacking.

Parameters

partitions	Packed input tables (partitions).
stream	CUDA stream on which concatenation occurs and on which the resulting table is ordered.
br	Buffer resource used for memory allocations.
allow_overbooking	If true, allow overbooking (true by default).

Returns: The concatenated table resulting from unpacking the input partitions.

Exceptions

rapidsmpf::reservation_error	If the buffer resource cannot reserve enough memory to concatenate all partitions.
std::logic_error	If the partitions are not in device memory.

See also: partition_and_pack; cudf::unpack; cudf::concatenate

◆ unspill_partitions()

std::vector<PackedData> rapidsmpf::unspill_partitions	(	std::vector< PackedData > &&	partitions,
		BufferResource *	br,
		AllowOverbooking	allow_overbooking
	)

Move spilled partitions (i.e., packed tables in host memory) back to device memory.

Each partition is inspected to determine whether its buffer resides in device memory. Buffers already in device memory are left untouched. Host-resident buffers are moved to device memory using the provided buffer resource and the buffer's CUDA stream.

If insufficient device memory is available, the buffer resource's spill manager is invoked to free memory. If overbooking occurs and spilling fails to reclaim enough memory, behavior depends on the allow_overbooking flag.

Parameters

partitions	The partitions to unspill, potentially containing host-resident data.
br	Buffer resource responsible for memory reservation and spills.
allow_overbooking	If false, ensures enough memory is freed to satisfy the reservation; otherwise, allows overbooking even if spilling was insufficient.

Returns: A vector of PackedData, each with a buffer in device memory.

Exceptions

rapidsmpf::reservation_error If overbooking exceeds the amount spilled and allow_overbooking is false.

Variable Documentation

◆ MEMORY_TYPE_NAMES

constexpr std::array<char const*, MEMORY_TYPES.size()> rapidsmpf::MEMORY_TYPE_NAMES

constexpr

Initial value:

{
    {"DEVICE", "PINNED_HOST", "HOST"}
}

Memory type names sorted to match MemoryType and MEMORY_TYPES.

Definition at line 28 of file memory_type.hpp.

◆ MEMORY_TYPES

constexpr std::array<MemoryType, 3> rapidsmpf::MEMORY_TYPES

constexpr

Initial value:

{
    {MemoryType::DEVICE, MemoryType::PINNED_HOST, MemoryType::HOST}
}

All memory types sorted in decreasing order of preference.

Definition at line 23 of file memory_type.hpp.

◆ SPILL_TARGET_MEMORY_TYPES

constexpr std::array<MemoryType, 2> rapidsmpf::SPILL_TARGET_MEMORY_TYPES

constexpr

Initial value:

{
    {MemoryType::PINNED_HOST, MemoryType::HOST}
}

Memory types that are valid spill destinations in decreasing order of preference.

This array defines the preferred targets for spilling when device memory is insufficient. The ordering reflects the policy of spilling in RapidsMPF, where earlier entries are considered more desirable spill destinations.

Definition at line 40 of file memory_type.hpp.
