API Reference

Module Contents

class rmm.DeviceBuffer

Bases: object

Attributes:

nbytes: Gets the size of the buffer in bytes.
ptr: Gets a pointer to the underlying data.
size: Gets the size of the buffer in bytes.

Methods

`capacity`(self)
`copy`(self)	Returns a copy of DeviceBuffer.
`copy_from_device`(self, cuda_ary, ...)	Copy from a buffer on device to `self`
`copy_from_host`(self, ary, ...)	Copy from a buffer on host to `self`
`copy_to_host`(self[, ary])	Copy from a `DeviceBuffer` to a buffer on host.
`prefetch`(self[, device, stream])	Prefetch buffer data to the specified device on the specified stream.
`reserve`(self, size_t new_capacity, ...)
`resize`(self, size_t new_size, ...)
`to_device`(const unsigned char[, ...)	Calls `to_device` function on arguments provided.
`tobytes`(self, Stream stream=DEFAULT_STREAM)

capacity(self) → size_t

copy(self)

Returns a copy of DeviceBuffer.

Returns:

A deep copy of existing DeviceBuffer

Examples

>>> import rmm
>>> db = rmm.DeviceBuffer.to_device(b"abc")
>>> db_copy = db.copy()
>>> db.copy_to_host()
array([97, 98, 99], dtype=uint8)
>>> db_copy.copy_to_host()
array([97, 98, 99], dtype=uint8)
>>> assert db is not db_copy
>>> assert db.ptr != db_copy.ptr

copy_from_device(self, cuda_ary, Stream stream=DEFAULT_STREAM)

Copy from a buffer on device to self

Parameters:

cuda_ary: object to copy from that has __cuda_array_interface__
stream: CUDA stream to use for copying, default the default stream

Examples

>>> import rmm
>>> db = rmm.DeviceBuffer(size=5)
>>> db2 = rmm.DeviceBuffer.to_device(b"abc")
>>> db.copy_from_device(db2)
>>> hb = db.copy_to_host()
>>> hb
array([97, 98, 99,  0,  0], dtype=uint8)

copy_from_host(self, ary, Stream stream=DEFAULT_STREAM)

Copy from a buffer on host to self

Parameters:

ary: bytes-like buffer to copy from
stream: CUDA stream to use for copying, default the default stream

Examples

>>> import rmm
>>> db = rmm.DeviceBuffer(size=10)
>>> hb = b"abc"
>>> db.copy_from_host(hb)
>>> hb = db.copy_to_host()
>>> hb
array([97, 98, 99,  0,  0,  0,  0,  0,  0,  0], dtype=uint8)

copy_to_host(self, ary=None, Stream stream=DEFAULT_STREAM)

Copy from a DeviceBuffer to a buffer on host.

Parameters:

ary: bytes-like buffer to write into
stream: CUDA stream to use for copying, default the default stream

Examples

>>> import rmm
>>> db = rmm.DeviceBuffer.to_device(b"abc")
>>> hb = bytearray(db.nbytes)
>>> db.copy_to_host(hb)
>>> print(hb)
bytearray(b'abc')
>>> hb = db.copy_to_host()
>>> hb
array([97, 98, 99], dtype=uint8)

nbytes: Gets the size of the buffer in bytes.

prefetch(self, device=None, stream=None)

Prefetch buffer data to the specified device on the specified stream.

Assumes the storage for this DeviceBuffer is CUDA managed memory (unified memory). If it is not, this function is a no-op.

Parameters:

device: optional: The CUDA device to which to prefetch the memory for this buffer. Defaults to the current CUDA device. To prefetch to the CPU, pass cudaCpuDeviceId as the device.
stream: optional: CUDA stream to use for prefetching. Defaults to self.stream

ptr: Gets a pointer to the underlying data.

reserve(self, size_t new_capacity, Stream stream=DEFAULT_STREAM) → void

resize(self, size_t new_size, Stream stream=DEFAULT_STREAM) → void

size: Gets the size of the buffer in bytes.

static to_device(const unsigned char[::1] b, Stream stream=DEFAULT_STREAM): Calls to_device function on arguments provided.

tobytes(self, Stream stream=DEFAULT_STREAM) → bytes

exception rmm.RMMError(errcode, msg): Bases: Exception

rmm.disable_logging(): Disable logging if it was enabled previously using rmm.initialize() or rmm.enable_logging().

rmm.enable_logging(log_file_name=None)

Enable logging of run-time events for all devices.

Parameters:

log_file_name: str, optional: Name of the log file. If not specified, the environment variable RMM_LOG_FILE is used. A ValueError is thrown if neither is available. A separate log file is produced for each device, and the suffix “.dev{id}” is automatically added to the log file name.

Notes

Note that if you use the environment variable CUDA_VISIBLE_DEVICES with logging enabled, the suffix may not be what you expect. For example, if you set CUDA_VISIBLE_DEVICES=1, the log file produced will still have suffix 0. Similarly, if you set CUDA_VISIBLE_DEVICES=1,0 and use devices 0 and 1, the log file with suffix 0 will correspond to the GPU with device ID 1. Use rmm.get_log_filenames() to get the log file names corresponding to each device.
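The suffix behaviour described in this note can be illustrated with a small sketch that does not call RMM. Both helpers, per_device_log_name and physical_gpu_for, are hypothetical names introduced only for illustration, and placing the suffix before the file extension is an assumption based on the file names shown in the rmm.get_log_filenames() example.

```python
import os


def per_device_log_name(log_file_name, device_id):
    """Hypothetical helper: insert ".dev{id}" before the file extension."""
    root, ext = os.path.splitext(log_file_name)
    return f"{root}.dev{device_id}{ext}"


def physical_gpu_for(device_id, cuda_visible_devices):
    """Hypothetical helper: map a device ID as seen by the process to the
    physical GPU listed at that position in CUDA_VISIBLE_DEVICES."""
    visible = [int(d) for d in cuda_visible_devices.split(",")]
    return visible[device_id]


# CUDA_VISIBLE_DEVICES=1: the only visible device has ID 0 but is physical GPU 1,
# so the log file still carries suffix 0.
print(per_device_log_name("rmm.log", 0), "-> GPU", physical_gpu_for(0, "1"))

# CUDA_VISIBLE_DEVICES=1,0: suffix 0 corresponds to GPU 1, suffix 1 to GPU 0.
for dev in (0, 1):
    print(per_device_log_name("rmm.log", dev), "-> GPU", physical_gpu_for(dev, "1,0"))
```

In real code, rmm.get_log_filenames() remains the reliable way to find the file for each device.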

rmm.flush_logger()

Flush the debug logger. This will cause any buffered log messages to be written to the log file.

Debug logging prints messages to a log file. See Debug Logging for more information.

See also

set_flush_level: Set the flush level for the debug logger.
get_flush_level: Get the current debug logging flush level.

Examples

>>> import rmm
>>> rmm.flush_logger() # flush the logger

rmm.get_flush_level()

Get the current debug logging flush level for the RMM logger. Messages of this level or higher will automatically flush to the file.

Debug logging prints messages to a log file. See Debug Logging for more information.

Returns:

logging_level: The current flush level, an instance of the logging_level enum.

See also

set_flush_level: Set the flush level for the logger.
flush_logger: Flush the logger.

Examples

>>> import rmm
>>> rmm.get_flush_level() # get current flush level
<logging_level.INFO: 2>

rmm.get_log_filenames()

Returns the log filename (or None if not writing logs) for each device in use.

Examples

>>> import rmm
>>> rmm.reinitialize(devices=[0, 1], logging=True, log_file_name="rmm.log")
>>> rmm.get_log_filenames()
{0: '/home/user/workspace/rapids/rmm/python/rmm.dev0.log',
 1: '/home/user/workspace/rapids/rmm/python/rmm.dev1.log'}

rmm.get_logging_level()

Get the current debug logging level.

Debug logging prints messages to a log file. See Debug Logging for more information.

Returns:

level: logging_level: The current debug logging level, an instance of the logging_level enum.

See also

set_logging_level: Set the debug logging level.

Examples

>>> import rmm
>>> rmm.get_logging_level() # get current logging level
<logging_level.INFO: 2>

rmm.is_initialized(): Returns True if RMM has been initialized, False otherwise.

class rmm.level_enum(*values)

Bases: IntEnum

Attributes:

denominator: the denominator of a rational number in lowest terms
imag: the imaginary part of a complex number
numerator: the numerator of a rational number in lowest terms
real: the real part of a complex number

Methods

`as_integer_ratio`(/)	Return a pair of integers, whose ratio is equal to the original int.
`bit_count`(/)	Number of ones in the binary representation of the absolute value of self.
`bit_length`(/)	Number of bits necessary to represent self in binary.
`conjugate`(/)	Returns self, the complex conjugate of any int.
`from_bytes`(/, bytes[, byteorder, signed])	Return the integer represented by the given array of bytes.
`is_integer`(/)	Returns True.
`to_bytes`(/[, length, byteorder, signed])	Return an array of bytes representing an integer.

critical = 5

debug = 1

error = 4

info = 2

n_levels = 7

off = 6

trace = 0

warn = 3

rmm.register_reinitialize_hook(func, *args, **kwargs)

Add a function to the list of functions (“hooks”) that will be called before reinitialize().

A user or library may register hooks to perform any necessary cleanup before RMM is reinitialized. For example, a library with an internal cache of objects that use device memory allocated by RMM can register a hook to release those references before RMM is reinitialized, thus ensuring that the relevant device memory resource can be deallocated.

Hooks are called in the reverse order they are registered. This is useful, for example, when a library registers multiple hooks and needs them to run in a specific order for cleanup to be safe. Hooks cannot rely on being registered in a particular order relative to hooks registered by other packages, since that is determined by package import ordering.

Parameters:

func: callable: Function to be called before reinitialize()
args, kwargs: Positional and keyword arguments to be passed to func

rmm.reinitialize(pool_allocator=False, managed_memory=False, initial_pool_size=None, maximum_pool_size=None, devices=0, logging=False, log_file_name=None)

Finalizes and then initializes RMM using the options passed. Using memory from a previous initialization of RMM is undefined behavior and should be avoided.

Parameters:

pool_allocator: bool, default False: If True, use a pool allocation strategy which can greatly improve performance.
managed_memory: bool, default False: If True, use managed memory for device memory allocation
initial_pool_size: int | str, default None: When pool_allocator is True, this indicates the initial pool size in bytes. By default, 1/2 of the total GPU memory is used. When pool_allocator is False, this argument is ignored if provided. A string argument is parsed using parse_bytes.
maximum_pool_size: int | str, default None: When pool_allocator is True, this indicates the maximum pool size in bytes. By default, the total available memory on the GPU is used. When pool_allocator is False, this argument is ignored if provided. A string argument is parsed using parse_bytes.
devices: int or List[int], default 0: GPU device IDs to register. By default registers only GPU 0.
logging: bool, default False: If True, enable run-time logging of all memory events (alloc, free, realloc). This has a significant performance impact.
log_file_name: str: Name of the log file. If not specified, the environment variable RMM_LOG_FILE is used. A ValueError is thrown if neither is available. A separate log file is produced for each device, and the suffix “.dev{id}” is automatically added to the log file name.

Notes

Note that if you use the environment variable CUDA_VISIBLE_DEVICES with logging enabled, the suffix may not be what you expect. For example, if you set CUDA_VISIBLE_DEVICES=1, the log file produced will still have suffix 0. Similarly, if you set CUDA_VISIBLE_DEVICES=1,0 and use devices 0 and 1, the log file with suffix 0 will correspond to the GPU with device ID 1. Use rmm.get_log_filenames() to get the log file names corresponding to each device.

rmm.set_flush_level(level)

Set the flush level for the debug logger. Messages of this level or higher will automatically flush to the file.

Debug logging prints messages to a log file. See Debug Logging for more information.

Parameters:

level: logging_level: The debug logging level. Valid values are instances of the logging_level enum.

Raises:

TypeError: If the logging level is not an instance of the logging_level enum.

See also

get_flush_level: Get the current debug logging flush level.
flush_logger: Flush the logger.

Examples

>>> import rmm
>>> rmm.set_flush_level(rmm.logging_level.WARN) # set flush level to warn

rmm.set_logging_level(level)

Set the debug logging level.

Debug logging prints messages to a log file. See Debug Logging for more information.

Parameters:

level: logging_level: The debug logging level. Valid values are instances of the logging_level enum.

Raises:

TypeError: If the logging level is not an instance of the logging_level enum.

See also

get_logging_level: Get the current debug logging level.

Examples

>>> import rmm
>>> rmm.set_logging_level(rmm.logging_level.WARN) # set logging level to warn

rmm.should_log(level)

Check if a message at the given level would be logged.

A message at the given level would be logged if the current debug logging level is set to a level that is at least as verbose as the given level, and the RMM module is compiled for a logging level at least as verbose. If these conditions are not both met, this function will return False.

Debug logging prints messages to a log file. See Debug Logging for more information.

Parameters:

level: logging_level: The debug logging level. Valid values are instances of the logging_level enum.

Returns:

should_log: bool: True if a message at the given level would be logged, False otherwise.

Raises:

TypeError: If the logging level is not an instance of the logging_level enum.
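The two conditions can be modelled without RMM. In this sketch, Level is a local copy of the numeric values listed for rmm.level_enum (lower values are more verbose), and would_log is a hypothetical function that mirrors the documented rule. It is not the library's implementation.

```python
from enum import IntEnum


class Level(IntEnum):
    """Local copy of the values listed for rmm.level_enum (lower = more verbose)."""
    trace = 0
    debug = 1
    info = 2
    warn = 3
    error = 4
    critical = 5
    off = 6


def would_log(message_level, current_level, compiled_level):
    """Both the runtime level and the compile-time level must be at least as
    verbose as (numerically <=) the level of the message."""
    return current_level <= message_level and compiled_level <= message_level


# Runtime level info, compiled for info: a warn message is logged.
print(would_log(Level.warn, current_level=Level.info, compiled_level=Level.info))
# Runtime level info is less verbose than debug: a debug message is dropped.
print(would_log(Level.debug, current_level=Level.info, compiled_level=Level.trace))
# Compiled only for warn and above: an info message is dropped even at runtime level trace.
print(would_log(Level.info, current_level=Level.trace, compiled_level=Level.warn))
```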

rmm.unregister_reinitialize_hook(func)

Remove func from list of hooks that will be called before reinitialize().

If func was registered more than once, every instance of it will be removed from the list of hooks.

Memory Resources

class rmm.mr.ArenaMemoryResource(DeviceMemoryResource upstream_mr, arena_size=None, bool dump_log_on_failure=False)

Bases: UpstreamResourceAdaptor

A suballocator that emphasizes fragmentation avoidance and scalable concurrency support.

Parameters:

upstream_mr: DeviceMemoryResource: The DeviceMemoryResource from which to allocate memory for arenas.
arena_size: int, optional: Size in bytes of the global arena. Defaults to half of the available memory on the current device.
dump_log_on_failure: bool, optional: Whether to dump the arena on allocation failure.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.BinningMemoryResource(DeviceMemoryResource upstream_mr, int8_t min_size_exponent=-1, int8_t max_size_exponent=-1)

Bases: UpstreamResourceAdaptor

Allocates memory from a set of specified “bin” sizes based on a specified allocation size.

If min_size_exponent and max_size_exponent are specified, initializes with one or more FixedSizeMemoryResource bins in the range [2**min_size_exponent, 2**max_size_exponent].

Call add_bin() to add additional bin allocators.

Parameters:

upstream_mr: DeviceMemoryResource: The memory resource to use for allocations larger than any of the bins.
min_size_exponent: size_t: The base-2 exponent of the minimum size FixedSizeMemoryResource bin to create.
max_size_exponent: size_t: The base-2 exponent of the maximum size FixedSizeMemoryResource bin to create.

Attributes:

bin_mrs: BinningMemoryResource.bin_mrs: list
upstream_mr

Methods

`add_bin`(self, size_t allocation_size, ...)	Adds a bin of the specified maximum allocation size to this memory resource.
`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

add_bin(self, size_t allocation_size, DeviceMemoryResource bin_resource=None)

Adds a bin of the specified maximum allocation size to this memory resource. If specified, uses bin_resource for allocation for this bin. If not specified, creates and uses a FixedSizeMemoryResource for allocation for this bin.

Allocations smaller than allocation_size and larger than the next smaller bin size will use this fixed-size memory resource.

Parameters:

allocation_size: size_t: The maximum allocation size in bytes for the created bin
bin_resource: DeviceMemoryResource: The resource to use for this bin (optional)
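How a request is routed to a bin can be sketched in plain Python. BinningModel is a hypothetical toy that returns the chosen bin size instead of allocating memory. It assumes one bin per power of two between the two exponents, and that a request exactly equal to a bin size is served by that bin. Neither detail is confirmed by the text above.

```python
import bisect


class BinningModel:
    """Toy model of bin selection; it returns labels instead of allocating memory."""

    def __init__(self, min_size_exponent=-1, max_size_exponent=-1):
        self.bin_sizes = []  # kept sorted in ascending order
        if min_size_exponent >= 0 and max_size_exponent >= 0:
            # Assumption: one bin for each power of two in the closed range.
            for exponent in range(min_size_exponent, max_size_exponent + 1):
                self.add_bin(2 ** exponent)

    def add_bin(self, allocation_size):
        if allocation_size not in self.bin_sizes:
            bisect.insort(self.bin_sizes, allocation_size)

    def choose(self, nbytes):
        """Return the smallest bin that fits nbytes, or 'upstream' if none does."""
        index = bisect.bisect_left(self.bin_sizes, nbytes)
        if index < len(self.bin_sizes):
            return self.bin_sizes[index]
        return "upstream"


model = BinningModel(min_size_exponent=18, max_size_exponent=22)  # 256 KiB to 4 MiB
model.add_bin(1024)                                               # extra 1 KiB bin
print(model.bin_sizes)
print(model.choose(1000), model.choose(300_000), model.choose(8 * 2**20))
```

Requests larger than every bin fall through to the upstream resource, which matches the role described for upstream_mr.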

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

bin_mrs

BinningMemoryResource.bin_mrs: list

Get the list of binned memory resources.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.CallbackMemoryResource(allocate_func, deallocate_func)

Bases: DeviceMemoryResource

A memory resource that uses the user-provided callables to do memory allocation and deallocation.

CallbackMemoryResource should really only be used for debugging memory issues, as there is a significant performance penalty associated with using a Python function for each memory allocation and deallocation.

Parameters:

allocate_func: callable: The allocation function must accept two arguments: an integer representing the number of bytes to allocate and a Stream on which to perform the allocation. It must return an integer representing the pointer to the allocated memory.
deallocate_func: callable: The deallocation function must accept three arguments: an integer representing the pointer to the memory to free, a second integer representing the number of bytes to free, and a Stream on which to perform the deallocation.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

Examples

>>> import rmm
>>> base_mr = rmm.mr.CudaMemoryResource()
>>> def allocate_func(size, stream):
...     print(f"Allocating {size} bytes")
...     return base_mr.allocate(size, stream)
...
>>> def deallocate_func(ptr, size, stream):
...     print(f"Deallocating {size} bytes")
...     return base_mr.deallocate(ptr, size, stream)
...
>>> rmm.mr.set_current_device_resource(
...     rmm.mr.CallbackMemoryResource(allocate_func, deallocate_func)
... )
>>> dbuf = rmm.DeviceBuffer(size=256)
Allocating 256 bytes
>>> del dbuf
Deallocating 256 bytes

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.CudaAsyncMemoryResource

Bases: DeviceMemoryResource

Memory resource that uses cudaMallocAsync/cudaFreeAsync for allocation/deallocation.

Parameters:

initial_pool_size: int | str, optional: Initial pool size in bytes. By default, half the available memory on the device is used. A string argument is parsed using parse_bytes.
release_threshold: int, optional: Release threshold in bytes. If the pool size grows beyond this value, unused memory held by the pool will be released at the next synchronization point.
enable_ipc: bool, optional: If True, enables export of POSIX file descriptor handles for the memory allocated by this resource so that it can be used with CUDA IPC.
enable_fabric: bool, optional: If True, enables export of fabric handles for the memory allocated by this resource.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.CudaAsyncViewMemoryResource

Bases: DeviceMemoryResource

Memory resource that uses cudaMallocAsync/cudaFreeAsync for allocation/deallocation with an existing CUDA memory pool.

This resource uses an existing CUDA memory pool handle (such as the default pool) instead of creating a new one. This is useful for integrating with existing GPU applications that already use a CUDA memory pool, or customizing the flags used by the memory pool.

The memory pool passed in must not be destroyed during the lifetime of this memory resource.

Parameters:

pool_handle: cudaMemPool_t or CUmemoryPool: Handle to a CUDA memory pool which will be used to serve allocation requests.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`pool_handle`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

pool_handle(self)

class rmm.mr.CudaMemoryResource

Bases: DeviceMemoryResource

Memory resource that uses cudaMalloc/cudaFree for allocation/deallocation.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.DeviceMemoryResource

Bases: object

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.FailureCallbackResourceAdaptor(DeviceMemoryResource upstream_mr, callback)

Bases: UpstreamResourceAdaptor

Memory resource that calls callback when memory allocation fails.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.
callback: callable: Function called when memory allocation fails.
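One way such an adaptor can be used is to release cached memory and retry. The sketch below is a plain-Python model, not RMM code. It assumes the callback receives the requested number of bytes and returns True to request a retry, and this reference does not state that contract. FailureCallbackModel, upstream_allocate and free_cache are hypothetical names.

```python
class FailureCallbackModel:
    """Toy wrapper: on MemoryError, ask a callback whether to retry."""

    def __init__(self, upstream_allocate, callback):
        self._upstream_allocate = upstream_allocate
        self._callback = callback

    def allocate(self, nbytes):
        while True:
            try:
                return self._upstream_allocate(nbytes)
            except MemoryError:
                # Assumption: callback(nbytes) returns True to request a retry.
                if not self._callback(nbytes):
                    raise


# Hypothetical upstream with 100 bytes free and a 400-byte cache that can be dropped.
state = {"free": 100, "cache": 400}


def upstream_allocate(nbytes):
    if nbytes > state["free"]:
        raise MemoryError(f"cannot allocate {nbytes} bytes")
    state["free"] -= nbytes
    return nbytes  # stand-in for a pointer


def free_cache(nbytes):
    if state["cache"] == 0:
        return False  # nothing left to release: give up
    state["free"] += state["cache"]
    state["cache"] = 0
    return True       # memory was released: retry the allocation


mr = FailureCallbackModel(upstream_allocate, free_cache)
print(mr.allocate(300), state)
```

The first attempt fails, the callback releases the cache, and the retry succeeds. A later request that still does not fit raises MemoryError once the callback returns False.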

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.FixedSizeMemoryResource(DeviceMemoryResource upstream_mr, size_t block_size=0x100000, size_t blocks_to_preallocate=128)

Bases: UpstreamResourceAdaptor

Memory resource which allocates memory blocks of a single fixed size.

Parameters:

upstream_mr: DeviceMemoryResource: The DeviceMemoryResource from which to allocate blocks for the pool.
block_size: int, optional: The size of blocks to allocate (default is 1MiB).
blocks_to_preallocate: int, optional: The number of blocks to allocate to initialize the pool.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

Notes

Supports only allocations of size smaller than the configured block_size.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.LimitingResourceAdaptor(DeviceMemoryResource upstream_mr, size_t allocation_limit)

Bases: UpstreamResourceAdaptor

Memory resource that limits the total amount of memory that can be allocated through an upstream memory resource.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.
allocation_limit: size_t: Maximum memory allowed for this allocator.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_allocated_bytes`(self)	Query the number of bytes that have been allocated.
`get_allocation_limit`(self)	Query the maximum number of bytes that this allocator is allowed to allocate.
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_allocated_bytes(self) → size_t: Query the number of bytes that have been allocated. This value cannot be used to determine how large an allocation is still possible, because fragmentation, internal page sizes, and alignment are not tracked by this allocator.

get_allocation_limit(self) → size_t: Query the maximum number of bytes that this allocator is allowed to allocate. This is the limit on the allocator and not a representation of the underlying device. The device may not be able to support this limit.
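The accounting behind the two query methods can be sketched as a plain-Python model. LimitingModel is a hypothetical stand-in that tracks only requested byte counts. It performs no device allocation and, like the adaptor described here, knows nothing about fragmentation or alignment.

```python
class LimitingModel:
    """Toy model of a byte-limit wrapper around an upstream allocator."""

    def __init__(self, allocation_limit):
        self._limit = allocation_limit
        self._allocated = 0

    def allocate(self, nbytes):
        if self._allocated + nbytes > self._limit:
            raise MemoryError(f"limit of {self._limit} bytes exceeded")
        self._allocated += nbytes
        return nbytes  # stand-in for a pointer from the upstream resource

    def deallocate(self, ptr, nbytes):
        self._allocated -= nbytes

    def get_allocated_bytes(self):
        return self._allocated

    def get_allocation_limit(self):
        return self._limit


mr = LimitingModel(allocation_limit=1024)
ptr = mr.allocate(768)
print(mr.get_allocated_bytes(), mr.get_allocation_limit())
try:
    mr.allocate(512)  # 768 + 512 > 1024, so the request is refused
except MemoryError as exc:
    print("refused:", exc)
mr.deallocate(ptr, 768)
print(mr.get_allocated_bytes())
```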

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.LoggingResourceAdaptor(DeviceMemoryResource upstream_mr, log_file_name=None)

Bases: UpstreamResourceAdaptor

Memory resource that logs information about allocations/deallocations performed by an upstream memory resource.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.
log_file_name: str: Path to the file to which logs are written.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`flush`(self)
`get_file_name`(self)
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

flush(self)

get_file_name(self)

get_upstream(self) → DeviceMemoryResource

upstream_mr

class rmm.mr.ManagedMemoryResource

Bases: DeviceMemoryResource

Memory resource that uses cudaMallocManaged/cudaFree for allocation/deallocation.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.PoolMemoryResource(DeviceMemoryResource upstream_mr, initial_pool_size=None, maximum_pool_size=None)

Bases: UpstreamResourceAdaptor

Coalescing best-fit suballocator which uses a pool of memory allocated from an upstream memory resource.

Parameters:

upstream_mr: DeviceMemoryResource: The DeviceMemoryResource from which to allocate blocks for the pool.
initial_pool_size: int | str, optional: Initial pool size in bytes. By default, half the available memory on the device is used.
maximum_pool_size: int | str, optional: Maximum size in bytes that the pool can grow to.
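The phrase "coalescing best-fit suballocator" can be made concrete with a toy model. BestFitPool below is a hypothetical plain-Python sketch over a single region addressed by offsets. It ignores alignment, pool growth and CUDA streams, and is not how RMM implements its pool.

```python
class BestFitPool:
    """Toy best-fit suballocator over one contiguous region of `size` bytes."""

    def __init__(self, size):
        self.free = [(0, size)]  # (offset, length) blocks, sorted by offset

    def allocate(self, nbytes):
        # Best fit: pick the smallest free block that is large enough.
        candidates = [block for block in self.free if block[1] >= nbytes]
        if not candidates:
            raise MemoryError(f"no free block of {nbytes} bytes")
        offset, length = min(candidates, key=lambda block: block[1])
        self.free.remove((offset, length))
        if length > nbytes:  # return the unused tail to the free list
            self.free.append((offset + nbytes, length - nbytes))
            self.free.sort()
        return offset

    def deallocate(self, offset, nbytes):
        self.free.append((offset, nbytes))
        self.free.sort()
        merged = [self.free[0]]
        for off, length in self.free[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:  # adjacent blocks: coalesce them
                merged[-1] = (last_off, last_len + length)
            else:
                merged.append((off, length))
        self.free = merged


pool = BestFitPool(1024)
a = pool.allocate(256)
b = pool.allocate(256)
c = pool.allocate(256)
pool.deallocate(a, 256)
pool.deallocate(c, 256)  # merges with the free tail that follows it
print(pool.free)
pool.deallocate(b, 256)  # bridges the two remaining free blocks
print(pool.free)
```

Coalescing on deallocation is what lets the pool serve a large request again after many small blocks have been returned.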

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)
`pool_size`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

pool_size(self)

upstream_mr

class rmm.mr.PrefetchResourceAdaptor(DeviceMemoryResource upstream_mr)

Bases: UpstreamResourceAdaptor

Memory resource that prefetches all allocations.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr
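
A short sketch of how this adaptor might be stacked on another resource. Pairing it with `rmm.mr.ManagedMemoryResource` is an assumption made here because prefetching is generally meaningful for managed (unified) memory; `make_prefetching_managed_resource` is a hypothetical helper name, and the import is deferred because it needs a CUDA device.

```python
def make_prefetching_managed_resource():
    """Wrap a managed-memory resource so that each allocation is prefetched."""
    import rmm.mr  # deferred: needs an RMM install and a CUDA device

    upstream = rmm.mr.ManagedMemoryResource()  # assumed choice of upstream
    prefetching = rmm.mr.PrefetchResourceAdaptor(upstream)

    # Make the adaptor the resource used for allocations on the current device.
    rmm.mr.set_current_device_resource(prefetching)
    return prefetching
```

The returned adaptor exposes the wrapped resource through `get_upstream()`.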

class rmm.mr.SamHeadroomMemoryResource(size_t headroom)

Bases: DeviceMemoryResource

Memory resource that uses malloc/free for allocation/deallocation while keeping the given amount of GPU memory reserved as headroom.

Parameters:

headroom: size_t: Size of the reserved GPU memory as headroom

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.StatisticsResourceAdaptor(DeviceMemoryResource upstream_mr)

Bases: UpstreamResourceAdaptor

Memory resource that tracks the current, peak and total allocations/deallocations performed by an upstream memory resource. Includes the ability to query these statistics at any time.

A stack of counters is maintained. Use push_counters() and pop_counters() to track statistics at different nesting levels.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.

Attributes:

allocation_counts: Gets the current, peak, and total allocated bytes and number of allocations.
upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)
`pop_counters`(self)	Pop a counter pair (bytes and allocations) from the stack
`push_counters`(self)	Push a new counter pair (bytes and allocations) on the stack

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

allocation_counts

StatisticsResourceAdaptor.allocation_counts: Statistics

Gets the current, peak, and total allocated bytes and number of allocations.

The dictionary keys are current_bytes, current_count, peak_bytes, peak_count, total_bytes, and total_count.

Returns:

dict: Dictionary containing allocation counts and bytes.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

pop_counters(self) → Statistics

Pop a counter pair (bytes and allocations) from the stack

Returns:

The popped statistics

push_counters(self) → Statistics

Push a new counter pair (bytes and allocations) on the stack

Returns:

The statistics _before_ the push
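
The push/pop behavior described above can be illustrated with a toy, pure-Python model. This is not RMM's implementation: `Counters` and `ToyCounterStack` are hypothetical names, the field names are borrowed from `rmm.statistics.Statistics`, and how a popped level is folded back into its parent is not described here, so the model simply discards it.

```python
from dataclasses import dataclass, replace


@dataclass
class Counters:
    """One level of the stack (bytes and allocation counts)."""
    current_bytes: int = 0
    current_count: int = 0
    peak_bytes: int = 0
    peak_count: int = 0
    total_bytes: int = 0
    total_count: int = 0


class ToyCounterStack:
    """Hypothetical stand-in for the adaptor's stack of counters."""

    def __init__(self):
        self._stack = [Counters()]

    def record_allocation(self, nbytes):
        top = self._stack[-1]
        top.current_bytes += nbytes
        top.current_count += 1
        top.total_bytes += nbytes
        top.total_count += 1
        top.peak_bytes = max(top.peak_bytes, top.current_bytes)
        top.peak_count = max(top.peak_count, top.current_count)

    def record_deallocation(self, nbytes):
        top = self._stack[-1]
        top.current_bytes -= nbytes
        top.current_count -= 1

    def push_counters(self):
        """Start a new zeroed level; return the statistics before the push."""
        before = replace(self._stack[-1])  # copy of the current top level
        self._stack.append(Counters())
        return before

    def pop_counters(self):
        """Remove the top level and return it (toy rule: keep the base level)."""
        if len(self._stack) < 2:
            raise IndexError("cannot pop the last counter level")
        return self._stack.pop()
```

In this model, allocations recorded between a push and the matching pop are visible only in the popped counters, which is the nesting idea the stack is meant to support.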

upstream_mr

class rmm.mr.SystemMemoryResource

Bases: DeviceMemoryResource

Memory resource that uses malloc/free for allocation/deallocation.

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

class rmm.mr.TrackingResourceAdaptor(DeviceMemoryResource upstream_mr, bool capture_stacks=False)

Bases: UpstreamResourceAdaptor

Memory resource that tracks allocations/deallocations performed by an upstream memory resource. Includes the ability to query all outstanding allocations with the stack trace, if desired.

Parameters:

upstream_mr: DeviceMemoryResource: The upstream memory resource.
capture_stacks: bool: Whether or not to capture the stack trace with each allocation.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_allocated_bytes`(self)	Query the number of bytes that have been allocated.
`get_outstanding_allocations_str`(self)	Returns a string containing information about the current outstanding allocations.
`get_upstream`(self)
`log_outstanding_allocations`(self)	Logs the output of get_outstanding_allocations_str to the current RMM log file if enabled.

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_allocated_bytes(self) → size_t: Query the number of bytes that have been allocated. Note that this cannot be used to determine how large an allocation is possible, because of possible fragmentation and because internal page sizes and alignment are not tracked by this allocator.

get_outstanding_allocations_str(self) → str: Returns a string containing information about the current outstanding allocations. For each allocation, the address, size and optional stack trace are shown.

get_upstream(self) → DeviceMemoryResource

log_outstanding_allocations(self): Logs the output of get_outstanding_allocations_str to the current RMM log file if enabled.

upstream_mr
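
A minimal sketch of querying outstanding allocations, assuming the adaptor is stacked on whatever resource is current for the device. `report_outstanding` is a hypothetical helper name, and the imports are deferred because they need a CUDA device. Note that the sketch leaves the tracker installed as the current resource.

```python
def report_outstanding(capture_stacks=False):
    """Track one allocation and return (allocated bytes, description string)."""
    import rmm  # deferred: needs an RMM install and a CUDA device
    import rmm.mr

    tracker = rmm.mr.TrackingResourceAdaptor(
        rmm.mr.get_current_device_resource(), capture_stacks=capture_stacks
    )
    rmm.mr.set_current_device_resource(tracker)

    buf = rmm.DeviceBuffer(size=1024)  # allocated through the tracker
    nbytes = tracker.get_allocated_bytes()
    details = tracker.get_outstanding_allocations_str()
    del buf  # release the buffer once the queries are done
    return nbytes, details
```

Passing `capture_stacks=True` asks the adaptor to record a stack trace with each allocation, which the description string can then include.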

class rmm.mr.UpstreamResourceAdaptor

Bases: DeviceMemoryResource

Parent class for all memory resources that track an upstream.

Upstream resource tracking requires maintaining a reference to the upstream mr so that it is kept alive and may be accessed by any downstream resource adaptors.

Attributes:

upstream_mr

Methods

`allocate`(self, size_t nbytes, ...)	Allocate `nbytes` bytes of memory.
`deallocate`(self, uintptr_t ptr, ...)	Deallocate memory pointed to by `ptr` of size `nbytes`.
`get_upstream`(self)

allocate(self, size_t nbytes, Stream stream=DEFAULT_STREAM)

Allocate nbytes bytes of memory.

Parameters:

nbytes: size_t: The size of the allocation in bytes
stream: Stream: Optional stream for the allocation

Raises:

MemoryError: If allocation fails.

deallocate(self, uintptr_t ptr, size_t nbytes, Stream stream=DEFAULT_STREAM)

Deallocate memory pointed to by ptr of size nbytes.

Parameters:

ptr: uintptr_t: Pointer to be deallocated
nbytes: size_t: Size of the allocation in bytes
stream: Stream: Optional stream for the deallocation

get_upstream(self) → DeviceMemoryResource

upstream_mr

rmm.mr.available_device_memory(): Returns a tuple of free and total device memory.

rmm.mr.disable_logging(): Disable logging if it was enabled previously using rmm.initialize() or rmm.enable_logging().

rmm.mr.enable_logging(log_file_name=None)

Enable logging of run-time events for all devices.

Parameters:

log_file_name: str, optional: Name of the log file. If not specified, the environment variable RMM_LOG_FILE is used. A ValueError is raised if neither is available. A separate log file is produced for each device, and the suffix “.dev{id}” is automatically added to the log file name.

Notes

Note that if you use the environment variable CUDA_VISIBLE_DEVICES with logging enabled, the suffix may not be what you expect. For example, if you set CUDA_VISIBLE_DEVICES=1, the log file produced will still have suffix 0. Similarly, if you set CUDA_VISIBLE_DEVICES=1,0 and use devices 0 and 1, the log file with suffix 0 will correspond to the GPU with device ID 1. Use rmm.get_log_filenames() to get the log file names corresponding to each device.
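
The naming and the CUDA_VISIBLE_DEVICES caveat can be sketched in plain Python. Both helpers are hypothetical and are not part of RMM; the placement of the suffix before the file extension is assumed from the sample `get_log_filenames()` output in this reference (`rmm.log` becoming `rmm.dev0.log`).

```python
import os


def per_device_log_name(log_file_name, device_index):
    """Insert ".dev{id}" before the extension, e.g. rmm.log -> rmm.dev0.log."""
    stem, ext = os.path.splitext(log_file_name)
    return f"{stem}.dev{device_index}{ext}"


def physical_device_for_suffix(cuda_visible_devices, suffix):
    """Return the CUDA_VISIBLE_DEVICES entry that a log suffix refers to.

    The suffix is the index of the device as seen by the process, so it
    selects an entry from the comma-separated list, not a physical ID.
    """
    entries = [e.strip() for e in cuda_visible_devices.split(",") if e.strip()]
    return entries[suffix]
```

With `CUDA_VISIBLE_DEVICES=1,0`, `physical_device_for_suffix("1,0", 0)` returns `"1"`, matching the note that the log file with suffix 0 corresponds to the GPU with device ID 1.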

rmm.mr.get_current_device_resource() → DeviceMemoryResource

Get the memory resource used for RMM device allocations on the current device.

If the returned memory resource is used when a different device is the active CUDA device, behavior is undefined.

rmm.mr.get_current_device_resource_type(): Get the memory resource type used for RMM device allocations on the current device.

rmm.mr.get_log_filenames()

Returns the log filename (or None if not writing logs) for each device in use.

Examples

>>> import rmm
>>> rmm.reinitialize(devices=[0, 1], logging=True, log_file_name="rmm.log")
>>> rmm.get_log_filenames()
{0: '/home/user/workspace/rapids/rmm/python/rmm.dev0.log',
 1: '/home/user/workspace/rapids/rmm/python/rmm.dev1.log'}

rmm.mr.get_per_device_resource(int device)

Get the default memory resource for the specified device.

If the returned memory resource is used when a different device is the active CUDA device, behavior is undefined.

Parameters:

device: int: The ID of the device for which to get the memory resource.

rmm.mr.get_per_device_resource_type(int device)

Get the memory resource type used for RMM device allocations on the specified device.

Parameters:

device: int: The device ID

rmm.mr.is_initialized(): Check whether RMM is initialized.

rmm.mr.set_current_device_resource(DeviceMemoryResource mr)

Set the default memory resource for the current device.

Parameters:

mr: DeviceMemoryResource: The memory resource to set. Must have been created while the current device is the active CUDA device.
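
One possible pattern is to swap the resource for a limited scope and restore the previous one afterwards. The context manager below is a hypothetical sketch, not part of RMM; the `get_current` and `set_current` parameters exist only so the logic can be exercised without a GPU, and default to `rmm.mr.get_current_device_resource` and `rmm.mr.set_current_device_resource`.

```python
from contextlib import contextmanager


@contextmanager
def temporary_resource(mr, get_current=None, set_current=None):
    """Use mr as the current device resource inside a with-block."""
    if get_current is None or set_current is None:
        import rmm.mr  # deferred: needs an RMM install and a CUDA device

        get_current = rmm.mr.get_current_device_resource
        set_current = rmm.mr.set_current_device_resource

    previous = get_current()
    set_current(mr)
    try:
        yield mr
    finally:
        # Restore the earlier resource even if the block raised.
        set_current(previous)
```

On a GPU machine this could be used as `with temporary_resource(rmm.mr.ManagedMemoryResource()): ...`, where the choice of resource is only an illustration.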

rmm.mr.set_per_device_resource(int device, DeviceMemoryResource mr)

Set the default memory resource for the specified device.

Parameters:

device: int: The ID of the device for which to set the memory resource.
mr: DeviceMemoryResource: The memory resource to set. Must have been created while device was the active CUDA device.

Memory Allocators

rmm.allocators.cupy.rmm_cupy_allocator(nbytes)

A CuPy allocator that makes use of RMM.

Examples

>>> from rmm.allocators.cupy import rmm_cupy_allocator
>>> import cupy
>>> cupy.cuda.set_allocator(rmm_cupy_allocator)

class rmm.allocators.numba.RMMNumbaManager(*args, **kwargs)

Bases: HostOnlyCUDAMemoryManager

External Memory Management Plugin implementation for Numba. Provides on-device allocation only.

See https://numba.readthedocs.io/en/stable/cuda/external-memory.html for details of the interface being implemented here.

Attributes:

interface_version: Returns an integer specifying the version of the EMM Plugin interface supported by the plugin implementation.

Methods

`defer_cleanup`()	Returns a context manager that disables cleanup of mapped or pinned host memory in the current context whilst it is active.
`get_ipc_handle`(memory)	Get an IPC handle for the MemoryPointer memory with offset modified by the RMM memory pool.
`get_memory_info`()	Returns `(free, total)` memory in bytes in the context.
`initialize`()	Perform any initialization required for the EMM plugin instance to be ready to use.
`memalloc`(size)	Allocate an on-device array from the RMM pool.
`memhostalloc`(size[, mapped, portable, wc])	Implements the allocation of pinned host memory.
`mempin`(owner, pointer, size[, mapped])	Implements the pinning of host memory.
`reset`()	Clears up all host memory (mapped and/or pinned) in the current context.

memallocmanaged

defer_cleanup()

Returns a context manager that disables cleanup of mapped or pinned host memory in the current context whilst it is active.

EMM Plugins that override this method must obtain the context manager from this method before yielding to ensure that cleanup of host allocations is also deferred.

get_ipc_handle(memory): Get an IPC handle for the MemoryPointer memory with offset modified by the RMM memory pool.

get_memory_info()

Returns (free, total) memory in bytes in the context.

This implementation raises NotImplementedError because the allocation will be performed using rmm’s currently set default mr, which may be a pool allocator.

initialize()

Perform any initialization required for the EMM plugin instance to be ready to use.

Returns:

None

property interface_version: Returns an integer specifying the version of the EMM Plugin interface supported by the plugin implementation. Should always return 1 for implementations of this version of the specification.

memalloc(size): Allocate an on-device array from the RMM pool.

memallocmanaged(size, attach_global)

memhostalloc(size, mapped=False, portable=False, wc=False)

Implements the allocation of pinned host memory.

It is recommended that this method is not overridden by EMM Plugin implementations. Plugins that need to customize it should subclass numba.cuda.BaseCUDAMemoryManager instead.

mempin(owner, pointer, size, mapped=False)

Implements the pinning of host memory.

It is recommended that this method is not overridden by EMM Plugin implementations. Plugins that need to customize it should subclass numba.cuda.BaseCUDAMemoryManager instead.

reset()

Clears up all host memory (mapped and/or pinned) in the current context.

EMM Plugins that override this method must call super().reset() to ensure that host allocations are also cleaned up.
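
A brief sketch of registering this plugin with Numba through `numba.cuda.set_memory_manager`, which belongs to Numba rather than RMM. `use_rmm_for_numba` is a hypothetical helper name, and the imports are deferred because they need Numba, RMM, and a CUDA device.

```python
def use_rmm_for_numba():
    """Route Numba's device allocations through RMM."""
    from numba import cuda  # deferred: needs Numba with CUDA support
    from rmm.allocators.numba import RMMNumbaManager

    # Pass the class itself; Numba creates the manager instance.
    # This is best done once, before other Numba CUDA work starts.
    cuda.set_memory_manager(RMMNumbaManager)
```

After this call, device arrays created by Numba are expected to be allocated through `memalloc`, that is, from RMM's current memory resource.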

Memory Statistics

class rmm.statistics.ProfilerRecords

Bases: object

Records of the memory statistics recorded by a profiler.

Attributes:

records: Dictionary mapping record names to their memory statistics.

Methods

`MemoryRecord`([num_calls, memory_total, ...])	Memory statistics of a single code block.
`add`(name, data)	Add memory statistics to the record named name.
`report`([ordered_by])	Pretty format the recorded memory statistics.

class MemoryRecord(num_calls: int = 0, memory_total: int = 0, memory_peak: int = 0)

Bases: object

Memory statistics of a single code block.

Attributes:

num_calls: Number of times this code block was invoked.
memory_total: Total number of bytes allocated.
memory_peak: Peak number of bytes allocated.

Methods

add

add(memory_total: int, memory_peak: int)

memory_peak: int = 0

memory_total: int = 0

num_calls: int = 0

add(name: str, data: Statistics) → None

Add memory statistics to the record named name.

This method is thread-safe.

Parameters:

name: Name of the record.
data: Memory statistics of name.

property records: dict[str, MemoryRecord]: Dictionary mapping record names to their memory statistics.

report(ordered_by: Literal['num_calls', 'memory_peak', 'memory_total'] = 'memory_peak') → str

Pretty format the recorded memory statistics.

Parameters:

ordered_by: Sort the statistics by this attribute.

Returns:

The pretty formatted string of the memory statistics

class rmm.statistics.Statistics(current_bytes: int, current_count: int, peak_bytes: int, peak_count: int, total_bytes: int, total_count: int)

Bases: object

Statistics returned by {get,push,pop}_statistics().

Attributes:

current_bytes: Current number of bytes allocated
current_count: Current number of allocations
peak_bytes: Peak number of bytes allocated
peak_count: Peak number of allocations
total_bytes: Total number of bytes allocated
total_count: Total number of allocations

current_bytes: int

current_count: int

peak_bytes: int

peak_count: int

total_bytes: int

total_count: int
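
As an illustration of how these fields relate, the sketch below derives the number of bytes already freed from a statistics object. `FakeStatistics`, `freed_bytes`, and `summarize` are hypothetical names; the stand-in class only mirrors the six documented fields so the helpers can run without a GPU. The arithmetic assumes every deallocation counted at this level matches an allocation counted at the same level.

```python
from dataclasses import dataclass


@dataclass
class FakeStatistics:
    """Stand-in with the same field names as rmm.statistics.Statistics."""
    current_bytes: int
    current_count: int
    peak_bytes: int
    peak_count: int
    total_bytes: int
    total_count: int


def freed_bytes(stats):
    """Bytes allocated in total minus bytes still allocated."""
    return stats.total_bytes - stats.current_bytes


def summarize(stats):
    """One-line, human-readable view of a statistics object."""
    return (
        f"{stats.current_bytes} B in {stats.current_count} live allocations, "
        f"peak {stats.peak_bytes} B, {freed_bytes(stats)} B already freed"
    )
```

The helpers read attributes only, so they should accept any object with these six fields, including the value returned by `rmm.statistics.get_statistics()` when statistics are enabled.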

rmm.statistics.enable_statistics() → None

Enable allocation statistics.

This function is idempotent. If statistics have been enabled for the current RMM resource stack, this is a no-op.

Warning

This modifies the current RMM memory resource. StatisticsResourceAdaptor is pushed onto the current RMM memory resource stack and must remain the topmost resource throughout the statistics gathering.

rmm.statistics.get_statistics() → Statistics | None

Get the current allocation statistics.

Returns:

If enabled, returns the current tracked statistics.
If disabled, returns None.

rmm.statistics.pop_statistics() → Statistics | None

Pop the counters of the current allocation statistics stack.

This returns the counters of current tracked statistics and pops them from the stack.

If statistics are disabled (the current memory resource is not an instance of StatisticsResourceAdaptor), this function is a no-op.

Returns:

If enabled, returns the popped counters.
If disabled, returns None.

rmm.statistics.profiler(*, records: ProfilerRecords = ProfilerRecords({}), name: str = '')

Decorator and context manager to profile a function or code block.

If statistics are enabled (the current memory resource is an instance of StatisticsResourceAdaptor), this decorator records the memory statistics of the decorated function or code block.

If statistics are disabled, this decorator/context is a no-op.

Parameters:

records: The profiler records that the memory statistics are written to. If not set, the default profiler records are used.
name: The name of the memory profile, mandatory when the profiler is used as a context manager. If used as a decorator, an empty name is allowed. In this case, the name is the filename, line number, and function name.
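
A minimal sketch of both usages, as a decorator and as a context manager, writing into an explicitly created `ProfilerRecords`. `profile_allocations` and the inner `allocate` function are hypothetical names, and the imports are deferred because they need a CUDA device.

```python
def profile_allocations():
    """Profile a function and a code block, then return the formatted report."""
    import rmm  # deferred: needs an RMM install and a CUDA device
    import rmm.statistics as stats

    stats.enable_statistics()  # without this, the profiler is a no-op
    records = stats.ProfilerRecords()

    # Decorator form: the name may be omitted.
    @stats.profiler(records=records)
    def allocate(nbytes):
        return rmm.DeviceBuffer(size=nbytes)

    allocate(1024)

    # Context-manager form: a name is mandatory.
    with stats.profiler(records=records, name="scratch buffers"):
        rmm.DeviceBuffer(size=4096)

    return records.report(ordered_by="memory_total")
```

Keeping a dedicated `records` object separates these measurements from anything written to the default records elsewhere in the program.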

rmm.statistics.push_statistics() → Statistics | None

Push new counters on the current allocation statistics stack.

This returns the current tracked statistics and pushes a new set of zero counters on the stack of statistics.

If statistics are disabled (the current memory resource is not an instance of StatisticsResourceAdaptor), this function is a no-op.

Returns:

If enabled, returns the current tracked statistics _before_ the push.
If disabled, returns None.

rmm.statistics.statistics()

Context manager to enable allocation statistics.

If statistics have been enabled already (the current memory resource is an instance of StatisticsResourceAdaptor), new counters are pushed on the current allocation statistics stack when entering the context and popped again when exiting, using push_statistics() and pop_statistics().

If statistics have not been enabled, a new StatisticsResourceAdaptor is set as the current RMM memory resource when entering the context and removed again when exiting.

Raises:

ValueError: If the current RMM memory resource was changed while in the context.
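
A short sketch of measuring one piece of work with this context. `peak_bytes_of` is a hypothetical helper name, and the import is deferred because it needs a CUDA device. The counters are read before the context exits, since exiting pops them or removes the adaptor.

```python
def peak_bytes_of(work):
    """Run work() with statistics enabled and return its peak allocated bytes."""
    import rmm.statistics as stats  # deferred: needs RMM and a CUDA device

    with stats.statistics():
        work()
        # Statistics are enabled inside the context, so this is not None.
        return stats.get_statistics().peak_bytes
```

For example, `peak_bytes_of(lambda: rmm.DeviceBuffer(size=1 << 20))` would measure a single one-mebibyte buffer, provided `work` does not replace the current memory resource.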