device_memory_resource derived class that uses cudaMallocAsync/cudaFreeAsync for allocation/deallocation.
#include <cuda_async_memory_resource.hpp>
|
| | cuda_async_memory_resource (std::optional< std::size_t > initial_pool_size={}, std::optional< std::size_t > release_threshold={}, std::optional< allocation_handle_type > export_handle_type={}) |
| | Constructs a cuda_async_memory_resource with the optionally specified initial pool size and release threshold.
|
| |
| cudaMemPool_t | pool_handle () const noexcept |
| | Returns the underlying native handle to the CUDA pool.
|
| |
|
| cuda_async_memory_resource (cuda_async_memory_resource const &)=delete |
| |
|
| cuda_async_memory_resource (cuda_async_memory_resource &&)=delete |
| |
|
cuda_async_memory_resource & | operator= (cuda_async_memory_resource const &)=delete |
| |
|
cuda_async_memory_resource & | operator= (cuda_async_memory_resource &&)=delete |
| |
|
| device_memory_resource (device_memory_resource const &)=default |
| | Default copy constructor.
|
| |
|
| device_memory_resource (device_memory_resource &&) noexcept=default |
| | Default move constructor.
|
| |
| device_memory_resource & | operator= (device_memory_resource const &)=default |
| | Default copy assignment operator.
|
| |
| device_memory_resource & | operator= (device_memory_resource &&) noexcept=default |
| | Default move assignment operator.
|
| |
| void * | allocate_sync (std::size_t bytes, std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) |
| | Allocates memory of size at least bytes.
|
| |
| void | deallocate_sync (void *ptr, std::size_t bytes, [[maybe_unused]] std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) noexcept |
| | Deallocate memory pointed to by ptr.
|
| |
| void * | allocate (cuda_stream_view stream, std::size_t bytes, std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) |
| | Allocates memory of size at least bytes on the specified stream.
|
| |
| void | deallocate (cuda_stream_view stream, void *ptr, std::size_t bytes, [[maybe_unused]] std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) noexcept |
| | Deallocate memory pointed to by ptr on the specified stream.
|
| |
| bool | is_equal (device_memory_resource const &other) const noexcept |
| | Compare this resource to another.
|
| |
| bool | operator== (device_memory_resource const &other) const noexcept |
| | Comparison operator with another device_memory_resource.
|
| |
| bool | operator!= (device_memory_resource const &other) const noexcept |
| | Comparison operator with another device_memory_resource.
|
| |
◆ allocation_handle_type
Flags for specifying memory allocation handle types.
- Note
- These values are exact copies from cudaMemAllocationHandleType. We need a placeholder that can be used consistently in the constructor of cuda_async_memory_resource with all supported versions of CUDA. See the cudaMemAllocationHandleType docs at https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html and ensure the enum values are kept in sync with the CUDA documentation.
- cudaMemHandleTypeFabric can be used instead of 0x8 once we require CUDA 12.4+.
| Enumerator | Description |
|---|---|
| none | Does not allow any export mechanism. |
| posix_file_descriptor | Allows a file descriptor to be used for exporting. Permitted only on POSIX systems. |
| win32 | Allows a Win32 NT handle to be used for exporting. (HANDLE) |
| win32_kmt | Allows a Win32 KMT handle to be used for exporting. (D3DKMT_HANDLE) |
| fabric | Allows a fabric handle to be used for exporting. (cudaMemFabricHandle_t) |
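The mirroring described in the note can be sketched as follows. This is a self-contained stand-in, not RMM's actual definition: the enumerator values are the ones documented for cudaMemAllocationHandleType (with the literal 0x8 for the fabric type, as the note mentions), while the underlying type and the helper `to_native_value` are assumptions introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the mirrored enum. The underlying type is an assumption;
// the values follow the documented cudaMemAllocationHandleType values.
enum class allocation_handle_type : std::int32_t {
  none                  = 0x0,  // no export mechanism
  posix_file_descriptor = 0x1,  // POSIX file descriptor
  win32                 = 0x2,  // Win32 NT handle
  win32_kmt             = 0x4,  // Win32 KMT handle
  fabric                = 0x8   // literal value, usable without CUDA 12.4+ headers
};

// Hypothetical helper: converts the placeholder to the integer value that
// would be passed on to the CUDA runtime.
constexpr std::int32_t to_native_value(allocation_handle_type type) noexcept
{
  return static_cast<std::int32_t>(type);
}

// Compile-time check that the fabric placeholder keeps the value from the note.
static_assert(to_native_value(allocation_handle_type::fabric) == 0x8,
              "fabric must stay in sync with the CUDA documentation");
```

In a build that includes the CUDA headers, similar `static_assert`s comparing each enumerator against the corresponding CUDA constant would be one way to keep the two enums in sync.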
◆ mempool_usage
◆ cuda_async_memory_resource()
rmm::mr::cuda_async_memory_resource::cuda_async_memory_resource (std::optional< std::size_t > initial_pool_size = {}, std::optional< std::size_t > release_threshold = {}, std::optional< allocation_handle_type > export_handle_type = {})
inline
Constructs a cuda_async_memory_resource with the optionally specified initial pool size and release threshold.
If the pool size grows beyond the release threshold, unused memory held by the pool will be released at the next synchronization event.
- Exceptions
-
- Parameters
-
| initial_pool_size | Optional initial size in bytes of the pool. If provided, the pool will be primed by allocating and immediately deallocating this amount of memory on the default CUDA stream. |
| release_threshold | Optional release threshold size in bytes of the pool. If no value is provided, the release threshold is set to the total amount of memory on the current device. |
| export_handle_type | Optional cudaMemAllocationHandleType that allocations from this resource should support for interprocess communication (IPC). Default is cudaMemHandleTypeNone, i.e. no IPC support. |
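How the two optional size parameters resolve to effective settings can be sketched as below. This is a minimal, self-contained model of the defaulting rules described in the parameter table, not the RMM implementation: `resolved_pool_settings`, `resolve_pool_settings`, and the explicit `total_device_memory` argument are hypothetical names introduced for illustration (a real resource would query the device instead).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical summary of the settings the constructor would end up using.
struct resolved_pool_settings {
  bool prime_pool;                // true if an initial pool size was given
  std::size_t priming_bytes;      // bytes to allocate and immediately free
  std::size_t release_threshold;  // effective release threshold in bytes
};

// Models the documented defaults. total_device_memory is passed in
// explicitly here (an assumption) so the sketch needs no CUDA runtime.
inline resolved_pool_settings resolve_pool_settings(
  std::optional<std::size_t> initial_pool_size,
  std::optional<std::size_t> release_threshold,
  std::size_t total_device_memory)
{
  return resolved_pool_settings{
    initial_pool_size.has_value(),   // priming happens only when a size is given
    initial_pool_size.value_or(0),   // nothing to prime otherwise
    // No threshold given: fall back to the total memory of the device.
    release_threshold.value_or(total_device_memory)};
}
```

Calling `resolve_pool_settings(std::nullopt, std::nullopt, total)` models a default-constructed resource: no priming, and a release threshold equal to the device's total memory.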
◆ pool_handle()
cudaMemPool_t rmm::mr::cuda_async_memory_resource::pool_handle () const
inline noexcept
Returns the underlying native handle to the CUDA pool.
- Returns
- cudaMemPool_t Handle to the underlying CUDA pool
The documentation for this class was generated from the following file: