rmm::mr::cuda_async_memory_resource Class Reference (final)

device_memory_resource derived class that uses cudaMallocAsync/cudaFreeAsync for allocation/deallocation. More...

#include <cuda_async_memory_resource.hpp>

Inheritance: rmm::mr::cuda_async_memory_resource derives from rmm::mr::device_memory_resource. (Inheritance and collaboration diagrams omitted.)

Public Types

enum class allocation_handle_type { none = 0x0, posix_file_descriptor = 0x1, win32 = 0x2, win32_kmt = 0x4 }
 Flags for specifying memory allocation handle types.

Public Member Functions

 cuda_async_memory_resource (thrust::optional< std::size_t > initial_pool_size={}, thrust::optional< std::size_t > release_threshold={}, thrust::optional< allocation_handle_type > export_handle_type={})
 Constructs a cuda_async_memory_resource with the optionally specified initial pool size and release threshold.

 cuda_async_memory_resource (cuda_async_memory_resource const &)=delete

 cuda_async_memory_resource (cuda_async_memory_resource &&)=delete

cuda_async_memory_resource & operator= (cuda_async_memory_resource const &)=delete

cuda_async_memory_resource & operator= (cuda_async_memory_resource &&)=delete

bool supports_streams () const noexcept override
 Query whether the resource supports use of non-null CUDA streams for allocation/deallocation. cuda_async_memory_resource supports streams.

bool supports_get_mem_info () const noexcept override
 Query whether the resource supports the get_mem_info API.
 
- Public Member Functions inherited from rmm::mr::device_memory_resource

 device_memory_resource (device_memory_resource const &)=default
 Default copy constructor.

 device_memory_resource (device_memory_resource &&) noexcept=default
 Default move constructor.

device_memory_resource & operator= (device_memory_resource const &)=default
 Default copy assignment operator.

device_memory_resource & operator= (device_memory_resource &&) noexcept=default
 Default move assignment operator.

void * allocate (std::size_t bytes, cuda_stream_view stream=cuda_stream_view{})
 Allocates memory of size at least bytes.

void deallocate (void *ptr, std::size_t bytes, cuda_stream_view stream=cuda_stream_view{})
 Deallocates memory pointed to by ptr.

bool is_equal (device_memory_resource const &other) const noexcept
 Compare this resource to another.

void * allocate (std::size_t bytes, std::size_t alignment)
 Allocates memory of size at least bytes.

void deallocate (void *ptr, std::size_t bytes, std::size_t alignment)
 Deallocates memory pointed to by ptr.

void * allocate_async (std::size_t bytes, std::size_t alignment, cuda_stream_view stream)
 Allocates memory of size at least bytes.

void * allocate_async (std::size_t bytes, cuda_stream_view stream)
 Allocates memory of size at least bytes.

void deallocate_async (void *ptr, std::size_t bytes, std::size_t alignment, cuda_stream_view stream)
 Deallocates memory pointed to by ptr.

void deallocate_async (void *ptr, std::size_t bytes, cuda_stream_view stream)
 Deallocates memory pointed to by ptr.

bool operator== (device_memory_resource const &other) const noexcept
 Comparison operator with another device_memory_resource.

bool operator!= (device_memory_resource const &other) const noexcept
 Comparison operator with another device_memory_resource.

std::pair< std::size_t, std::size_t > get_mem_info (cuda_stream_view stream) const
 Queries the amount of free and total memory for the resource.

Detailed Description

device_memory_resource derived class that uses cudaMallocAsync/cudaFreeAsync for allocation/deallocation.
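A minimal usage sketch follows. It assumes a CUDA 11.2+ toolkit and an NVIDIA GPU, so it is not runnable on a CPU-only host; the header paths follow the rmm source layout, and making the resource the current device resource via set_current_device_resource is optional:

```cpp
#include <rmm/mr/device/cuda_async_memory_resource.hpp>
#include <rmm/mr/device/per_device_resource.hpp>
#include <rmm/device_buffer.hpp>

int main()
{
  // Pool backed by cudaMallocAsync; by default the initial pool size is
  // half of the available GPU memory (see the constructor docs below).
  rmm::mr::cuda_async_memory_resource mr{};

  // Optionally make it the default resource for the current device.
  rmm::mr::set_current_device_resource(&mr);

  // Allocate 1 MiB through the resource, stream-ordered on the default stream.
  rmm::device_buffer buf{1 << 20, rmm::cuda_stream_view{}, &mr};
  return 0;
}
```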

Member Enumeration Documentation

◆ allocation_handle_type

Flags for specifying memory allocation handle types.

Note
These values are exact copies from cudaMemAllocationHandleType. We need to define our own enum here because the earliest CUDA runtime version that supports asynchronous memory pools (CUDA 11.2) did not support these flags, so we need a placeholder that can be used consistently in the constructor of cuda_async_memory_resource with all versions of CUDA >= 11.2. See the cudaMemAllocationHandleType docs at https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html
Enumerator

none  Does not allow any export mechanism.
posix_file_descriptor  Allows a file descriptor to be used for exporting. Permitted only on POSIX systems.
win32  Allows a Win32 NT handle (HANDLE) to be used for exporting.
win32_kmt  Allows a Win32 KMT handle (D3DKMT_HANDLE) to be used for exporting.

Constructor & Destructor Documentation

◆ cuda_async_memory_resource()

rmm::mr::cuda_async_memory_resource::cuda_async_memory_resource ( thrust::optional< std::size_t > initial_pool_size = {},
    thrust::optional< std::size_t > release_threshold = {},
    thrust::optional< allocation_handle_type > export_handle_type = {} )
inline

Constructs a cuda_async_memory_resource with the optionally specified initial pool size and release threshold.

If the pool size grows beyond the release threshold, unused memory held by the pool will be released at the next synchronization event.

Exceptions
rmm::logic_error  if the CUDA version does not support cudaMallocAsync

Parameters
initial_pool_size  Optional initial size in bytes of the pool. If no value is provided, the initial pool size is half of the available GPU memory.
release_threshold  Optional release threshold size in bytes of the pool. If no value is provided, the release threshold is set to the total amount of memory on the current device.
export_handle_type  Optional cudaMemAllocationHandleType that allocations from this resource should support for interprocess communication (IPC). Default is cudaMemHandleTypeNone, for no IPC support.

Member Function Documentation

◆ supports_get_mem_info()

bool rmm::mr::cuda_async_memory_resource::supports_get_mem_info ( ) const
inline override virtual noexcept

Query whether the resource supports the get_mem_info API.

Returns
false

Implements rmm::mr::device_memory_resource.

◆ supports_streams()

bool rmm::mr::cuda_async_memory_resource::supports_streams ( ) const
inline override virtual noexcept

Query whether the resource supports use of non-null CUDA streams for allocation/deallocation. cuda_async_memory_resource supports streams.

Returns
true

Implements rmm::mr::device_memory_resource.


The documentation for this class was generated from the following file: cuda_async_memory_resource.hpp