API Reference¶
High-level API¶
- class rmm.rmm.RMMNumbaManager(*args, **kwargs)¶
Bases:
numba.cuda.cudadrv.driver.HostOnlyCUDAMemoryManager
External Memory Management Plugin implementation for Numba. Provides on-device allocation only.
See http://numba.pydata.org/numba-doc/latest/cuda/external-memory.html for details of the interface being implemented here.
- Attributes
interface_version
Returns an integer specifying the version of the EMM Plugin interface supported by the plugin implementation.
Methods
defer_cleanup()
Returns a context manager that disables cleanup of mapped or pinned host memory in the current context whilst it is active.
get_ipc_handle(memory)
Get an IPC handle for the MemoryPointer memory with offset modified by the RMM memory pool.
get_memory_info()
Returns (free, total) memory in bytes in the context.
initialize()
Perform any initialization required for the EMM plugin instance to be ready to use.
memalloc(size)
Allocate an on-device array from the RMM pool.
memhostalloc(size[, mapped, portable, wc])
Implements the allocation of pinned host memory.
mempin(owner, pointer, size[, mapped])
Implements the pinning of host memory.
reset()
Clears up all host memory (mapped and/or pinned) in the current context.
memallocmanaged
- get_ipc_handle(memory)¶
Get an IPC handle for the MemoryPointer memory with offset modified by the RMM memory pool.
- get_memory_info()¶
Returns (free, total) memory in bytes in the context. May raise NotImplementedError if returning such information is not practical (e.g. for a pool allocator).
- Returns
Memory info
- Return type
MemoryInfo
- initialize()¶
Perform any initialization required for the EMM plugin instance to be ready to use.
- Returns
None
- property interface_version¶
Returns an integer specifying the version of the EMM Plugin interface supported by the plugin implementation. Should always return 1 for implementations of this version of the specification.
- memalloc(size)¶
Allocate an on-device array from the RMM pool.
- rmm.rmm.is_initialized()¶
Returns True if RMM has been initialized, False otherwise.
- rmm.rmm.reinitialize(pool_allocator=False, managed_memory=False, initial_pool_size=None, maximum_pool_size=None, devices=0, logging=False, log_file_name=None)¶
Finalizes and then initializes RMM using the options passed. Using memory from a previous initialization of RMM is undefined behavior and should be avoided.
- Parameters
- pool_allocator : bool, default False
If True, use a pool allocation strategy, which can greatly improve performance.
- managed_memory : bool, default False
If True, use managed memory for device memory allocation.
- initial_pool_size : int, default None
When pool_allocator is True, this indicates the initial pool size in bytes. By default, 1/2 of the total GPU memory is used. When pool_allocator is False, this argument is ignored if provided.
- maximum_pool_size : int, default None
When pool_allocator is True, this indicates the maximum pool size in bytes. By default, the total available memory on the GPU is used. When pool_allocator is False, this argument is ignored if provided.
- devices : int or List[int], default 0
GPU device IDs to register. By default registers only GPU 0.
- logging : bool, default False
If True, enable run-time logging of all memory events (alloc, free, realloc). This has a significant performance impact.
- log_file_name : str
Name of the log file. If not specified, the environment variable RMM_LOG_FILE is used. A ValueError is thrown if neither is available. A separate log file is produced for each device, and the suffix ".dev{id}" is automatically added to the log file name.
Notes
Note that if you use the environment variable CUDA_VISIBLE_DEVICES with logging enabled, the suffix may not be what you expect. For example, if you set CUDA_VISIBLE_DEVICES=1, the log file produced will still have suffix 0. Similarly, if you set CUDA_VISIBLE_DEVICES=1,0 and use devices 0 and 1, the log file with suffix 0 will correspond to the GPU with device ID 1. Use rmm.get_log_filenames() to get the log file names corresponding to each device.
- rmm.rmm.rmm_cupy_allocator(nbytes)¶
A CuPy allocator that makes use of RMM.
Examples
>>> import rmm
>>> import cupy
>>> cupy.cuda.set_allocator(rmm.rmm_cupy_allocator)
Memory Resources¶
- class rmm.mr.BinningMemoryResource(DeviceMemoryResource upstream_mr, int8_t min_size_exponent=-1, int8_t max_size_exponent=-1)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Allocates memory from a set of specified “bin” sizes based on a specified allocation size.
If min_size_exponent and max_size_exponent are specified, initializes with one or more FixedSizeMemoryResource bins in the range [2^min_size_exponent, 2^max_size_exponent].
Call add_bin to add additional bin allocators.
- Parameters
- upstream_mr : DeviceMemoryResource
The memory resource to use for allocations larger than any of the bins.
- min_size_exponent : size_t
The base-2 exponent of the minimum size FixedSizeMemoryResource bin to create.
- max_size_exponent : size_t
The base-2 exponent of the maximum size FixedSizeMemoryResource bin to create.
- Attributes
- bin_mrs
- upstream_mr
Methods
add_bin(self, size_t allocation_size, ...)
Adds a bin of the specified maximum allocation size to this memory resource.
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_upstream(self)
- add_bin(self, size_t allocation_size, DeviceMemoryResource bin_resource=None)¶
Adds a bin of the specified maximum allocation size to this memory resource. If specified, uses bin_resource for allocation for this bin. If not specified, creates and uses a FixedSizeMemoryResource for allocation for this bin.
Allocations smaller than allocation_size and larger than the next smaller bin size will use this fixed-size memory resource.
- Parameters
- allocation_size : size_t
The maximum allocation size in bytes for the created bin.
- bin_resource : DeviceMemoryResource
The resource to use for this bin (optional).
- bin_mrs¶
- class rmm.mr.CallbackMemoryResource(allocate_func, deallocate_func)¶
Bases:
rmm._lib.memory_resource.DeviceMemoryResource
A memory resource that uses the user-provided callables to do memory allocation and deallocation.
CallbackMemoryResource should really only be used for debugging memory issues, as there is a significant performance penalty associated with using a Python function for each memory allocation and deallocation.
- Parameters
- allocate_func: callable
The allocation function must accept a single integer argument, representing the number of bytes to allocate, and return an integer representing the pointer to the allocated memory.
- deallocate_func: callable
The deallocation function must accept two arguments, an integer representing the pointer to the memory to free, and a second integer representing the number of bytes to free.
Examples
>>> import rmm
>>> base_mr = rmm.mr.CudaMemoryResource()
>>> def allocate_func(size):
...     print(f"Allocating {size} bytes")
...     return base_mr.allocate(size)
...
>>> def deallocate_func(ptr, size):
...     print(f"Deallocating {size} bytes")
...     return base_mr.deallocate(ptr, size)
...
>>> rmm.mr.set_current_device_resource(
...     rmm.mr.CallbackMemoryResource(allocate_func, deallocate_func)
... )
>>> dbuf = rmm.DeviceBuffer(size=256)
Allocating 256 bytes
>>> del dbuf
Deallocating 256 bytes
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
- class rmm.mr.CudaAsyncMemoryResource¶
Bases:
rmm._lib.memory_resource.DeviceMemoryResource
Memory resource that uses cudaMallocAsync/Free for allocation/deallocation.
- Parameters
- initial_pool_size : int, optional
Initial pool size in bytes. By default, half the available memory on the device is used.
- release_threshold : int, optional
Release threshold in bytes. If the pool size grows beyond this value, unused memory held by the pool will be released at the next synchronization point.
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
- class rmm.mr.CudaMemoryResource¶
Bases:
rmm._lib.memory_resource.DeviceMemoryResource
Memory resource that uses cudaMalloc/Free for allocation/deallocation.
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
- class rmm.mr.DeviceMemoryResource¶
Bases:
object
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
- allocate(self, size_t nbytes)¶
- deallocate(self, uintptr_t ptr, size_t nbytes)¶
- class rmm.mr.FailureCallbackResourceAdaptor(DeviceMemoryResource upstream_mr, callback)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Memory resource that calls a callback function when memory allocation fails.
- Parameters
- upstream : DeviceMemoryResource
The upstream memory resource.
- callback : callable
Function called when memory allocation fails.
- Attributes
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_upstream(self)
- class rmm.mr.FixedSizeMemoryResource(DeviceMemoryResource upstream_mr, size_t block_size=1048576, size_t blocks_to_preallocate=128)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Memory resource which allocates memory blocks of a single fixed size.
- Parameters
- upstream_mr : DeviceMemoryResource
The DeviceMemoryResource from which to allocate blocks for the pool.
- block_size : int, optional
The size of blocks to allocate (default is 1MiB).
- blocks_to_preallocate : int, optional
The number of blocks to allocate to initialize the pool.
Notes
Supports only allocations of size smaller than the configured block_size.
- Attributes
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_upstream(self)
- class rmm.mr.LoggingResourceAdaptor(DeviceMemoryResource upstream_mr, log_file_name=None)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Memory resource that logs information about allocations/deallocations performed by an upstream memory resource.
- Parameters
- upstream : DeviceMemoryResource
The upstream memory resource.
- log_file_name : str
Path to the file to which logs are written.
- Attributes
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
flush(self)
get_file_name(self)
get_upstream(self)
- flush(self)¶
- get_file_name(self)¶
- class rmm.mr.ManagedMemoryResource¶
Bases:
rmm._lib.memory_resource.DeviceMemoryResource
Memory resource that uses cudaMallocManaged/Free for allocation/deallocation.
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
- class rmm.mr.PoolMemoryResource(DeviceMemoryResource upstream_mr, initial_pool_size=None, maximum_pool_size=None)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Coalescing best-fit suballocator which uses a pool of memory allocated from an upstream memory resource.
- Parameters
- upstream_mr : DeviceMemoryResource
The DeviceMemoryResource from which to allocate blocks for the pool.
- initial_pool_size : int, optional
Initial pool size in bytes. By default, half the available memory on the device is used.
- maximum_pool_size : int, optional
Maximum size in bytes to which the pool can grow.
- Attributes
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_upstream(self)
pool_size(self)
- pool_size(self)¶
- class rmm.mr.StatisticsResourceAdaptor(DeviceMemoryResource upstream_mr)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Memory resource that tracks the current, peak and total allocations/deallocations performed by an upstream memory resource. Includes the ability to query these statistics at any time.
- Parameters
- upstream : DeviceMemoryResource
The upstream memory resource.
- Attributes
allocation_counts
Gets the current, peak, and total allocated bytes and number of allocations.
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_upstream(self)
- class rmm.mr.TrackingResourceAdaptor(DeviceMemoryResource upstream_mr, bool capture_stacks=False)¶
Bases:
rmm._lib.memory_resource.UpstreamResourceAdaptor
Memory resource that tracks allocations/deallocations performed by an upstream memory resource. Includes the ability to query all outstanding allocations with the stack trace, if desired.
- Parameters
- upstream : DeviceMemoryResource
The upstream memory resource.
- capture_stacks : bool
Whether or not to capture the stack trace with each allocation.
- Attributes
- upstream_mr
Methods
allocate(self, size_t nbytes)
deallocate(self, uintptr_t ptr, size_t nbytes)
get_allocated_bytes(self)
Query the number of bytes that have been allocated.
get_outstanding_allocations_str(self)
Returns a string containing information about the current outstanding allocations.
get_upstream(self)
log_outstanding_allocations(self)
Logs the output of get_outstanding_allocations_str to the current RMM log file if enabled.
- get_allocated_bytes(self) → size_t¶
Query the number of bytes that have been allocated. Note that this cannot be used to determine how large an allocation is possible, due to both fragmentation and internal page sizes and alignment that are not tracked by this allocator.
- get_outstanding_allocations_str(self) → str¶
Returns a string containing information about the current outstanding allocations. For each allocation, the address, size and optional stack trace are shown.
- log_outstanding_allocations(self)¶
Logs the output of get_outstanding_allocations_str to the current RMM log file if enabled.
- rmm.mr.disable_logging()¶
Disable logging if it was enabled previously using rmm.initialize() or rmm.enable_logging().
- rmm.mr.enable_logging(log_file_name=None)¶
Enable logging of run-time events for all devices.
- Parameters
- log_file_name: str, optional
Name of the log file. If not specified, the environment variable RMM_LOG_FILE is used. A ValueError is thrown if neither is available. A separate log file is produced for each device, and the suffix “.dev{id}” is automatically added to the log file name.
Notes
Note that if you use the environment variable CUDA_VISIBLE_DEVICES with logging enabled, the suffix may not be what you expect. For example, if you set CUDA_VISIBLE_DEVICES=1, the log file produced will still have suffix 0. Similarly, if you set CUDA_VISIBLE_DEVICES=1,0 and use devices 0 and 1, the log file with suffix 0 will correspond to the GPU with device ID 1. Use rmm.get_log_filenames() to get the log file names corresponding to each device.
- rmm.mr.get_current_device_resource() → DeviceMemoryResource¶
Get the memory resource used for RMM device allocations on the current device.
If the returned memory resource is used when a different device is the active CUDA device, behavior is undefined.
- rmm.mr.get_current_device_resource_type()¶
Get the memory resource type used for RMM device allocations on the current device.
- rmm.mr.get_log_filenames()¶
Returns the log filename (or None if not writing logs) for each device in use.
Examples
>>> import rmm
>>> rmm.reinitialize(devices=[0, 1], logging=True, log_file_name="rmm.log")
>>> rmm.get_log_filenames()
{0: '/home/user/workspace/rapids/rmm/python/rmm.dev0.log',
 1: '/home/user/workspace/rapids/rmm/python/rmm.dev1.log'}
- rmm.mr.get_per_device_resource(int device)¶
Get the default memory resource for the specified device.
If the returned memory resource is used when a different device is the active CUDA device, behavior is undefined.
- Parameters
- device : int
The ID of the device for which to get the memory resource.
- rmm.mr.get_per_device_resource_type(int device)¶
Get the memory resource type used for RMM device allocations on the specified device.
- Parameters
- device : int
The device ID.
- rmm.mr.is_initialized()¶
Check whether RMM is initialized.
- rmm.mr.set_current_device_resource(DeviceMemoryResource mr)¶
Set the default memory resource for the current device.
- Parameters
- mr : DeviceMemoryResource
The memory resource to set. Must have been created while the current device is the active CUDA device.
- rmm.mr.set_per_device_resource(int device, DeviceMemoryResource mr)¶
Set the default memory resource for the specified device.
- Parameters
- device : int
The ID of the device for which to set the memory resource.
- mr : DeviceMemoryResource
The memory resource to set. Must have been created while device was the active CUDA device.