Base class for all librmm device memory allocation.

#include <device_memory_resource.hpp>

Public Member Functions
	device_memory_resource (device_memory_resource const &)=default
	Default copy constructor.

	device_memory_resource (device_memory_resource &&) noexcept=default
	Default move constructor.

device_memory_resource &	operator= (device_memory_resource const &)=default
	Default copy assignment operator.

device_memory_resource &	operator= (device_memory_resource &&) noexcept=default
	Default move assignment operator.

void *	allocate_sync (std::size_t bytes, std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT)
	Allocates memory of size at least `bytes`.

void	deallocate_sync (void *ptr, std::size_t bytes, [[maybe_unused]] std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) noexcept
	Deallocate memory pointed to by `ptr`.

void *	allocate (cuda_stream_view stream, std::size_t bytes, std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT)
	Allocates memory of size at least `bytes` on the specified stream.

void	deallocate (cuda_stream_view stream, void *ptr, std::size_t bytes, [[maybe_unused]] std::size_t alignment=rmm::CUDA_ALLOCATION_ALIGNMENT) noexcept
	Deallocate memory pointed to by `ptr` on the specified stream.

bool	is_equal (device_memory_resource const &other) const noexcept
	Compare this resource to another.

bool	operator== (device_memory_resource const &other) const noexcept
	Comparison operator with another device_memory_resource.

bool	operator!= (device_memory_resource const &other) const noexcept
	Comparison operator with another device_memory_resource.

Friends
void	get_property (device_memory_resource const &, cuda::mr::device_accessible) noexcept
	Enables the `cuda::mr::device_accessible` property.

Detailed Description

Base class for all librmm device memory allocation.

This class serves as the interface that all custom device memory implementations must satisfy.

There are two private, pure virtual functions that all derived classes must implement: do_allocate and do_deallocate. Optionally, derived classes may also override the private virtual function do_is_equal, which backs the public is_equal. By default, is_equal simply performs an identity comparison.

The public, non-virtual functions allocate, deallocate, and is_equal simply call the private virtual functions. The reason for this is to allow implementing shared, default behavior in the base class. For example, the base class' allocate function may log every allocation, no matter what derived class implementation is used.
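The split between public non-virtual functions and private virtual ones can be illustrated with a small host-only sketch. Everything here is hypothetical: toy_resource and toy_malloc_resource are stand-ins written for illustration, they use std::malloc instead of device memory, and the counters merely stand in for the kind of shared behavior (such as logging) the base class could add. This is not RMM's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a base resource; not RMM's device_memory_resource.
class toy_resource {
 public:
  virtual ~toy_resource() = default;

  // Public, non-virtual: shared bookkeeping wraps the derived-class call.
  void* allocate(std::size_t bytes)
  {
    void* ptr = do_allocate(bytes);  // behavior supplied by the derived class
    ++allocation_count_;             // shared behavior applied to every derived class
    outstanding_bytes_ += bytes;
    return ptr;
  }

  void deallocate(void* ptr, std::size_t bytes) noexcept
  {
    assert(outstanding_bytes_ >= bytes);
    do_deallocate(ptr, bytes);
    outstanding_bytes_ -= bytes;
  }

  std::size_t allocation_count() const noexcept { return allocation_count_; }
  std::size_t outstanding_bytes() const noexcept { return outstanding_bytes_; }

 private:
  // Private, pure virtual: the only functions a derived class must provide.
  virtual void* do_allocate(std::size_t bytes)                      = 0;
  virtual void do_deallocate(void* ptr, std::size_t bytes) noexcept = 0;

  std::size_t allocation_count_{0};
  std::size_t outstanding_bytes_{0};
};

// Hypothetical derived class backed by host malloc/free.
class toy_malloc_resource final : public toy_resource {
 private:
  void* do_allocate(std::size_t bytes) override
  {
    void* ptr = std::malloc(bytes);
    if (ptr == nullptr) { throw std::bad_alloc{}; }
    return ptr;
  }

  void do_deallocate(void* ptr, std::size_t /*bytes*/) noexcept override { std::free(ptr); }
};
```

Callers only ever see allocate and deallocate, so the bookkeeping in the base class runs no matter which derived class is in use.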

The allocate and deallocate APIs and implementations provide stream-ordered memory allocation. This allows optimizations such as re-using memory deallocated on the same stream without the overhead of stream synchronization.
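To make the reuse optimization concrete, here is a deliberately simplified host-only model. It is an assumption-laden sketch, not how any RMM resource is implemented: toy_stream is a plain integer standing in for a CUDA stream, memory comes from std::malloc, and the cache only hands a freed block back to a request of the same size on the same stream. A real resource would also need a strategy (such as events or synchronization) for reusing memory across streams, which this model simply avoids.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <new>
#include <utility>
#include <vector>

using toy_stream = int;  // hypothetical stand-in for a CUDA stream handle

// Hypothetical cache that keeps freed blocks per (stream, size) pair.
class toy_stream_ordered_cache {
 public:
  toy_stream_ordered_cache()                                           = default;
  toy_stream_ordered_cache(toy_stream_ordered_cache const&)            = delete;
  toy_stream_ordered_cache& operator=(toy_stream_ordered_cache const&) = delete;

  ~toy_stream_ordered_cache()
  {
    // Release every block still held in the cache.
    for (auto& entry : free_blocks_) {
      for (void* ptr : entry.second) { std::free(ptr); }
    }
  }

  void* allocate(toy_stream stream, std::size_t bytes)
  {
    auto found = free_blocks_.find(key{stream, bytes});
    if (found != free_blocks_.end() && !found->second.empty()) {
      // Same stream, same size: work on one stream is ordered, so the block
      // can be handed out again without any synchronization.
      void* ptr = found->second.back();
      found->second.pop_back();
      ++reuse_count_;
      return ptr;
    }
    void* ptr = std::malloc(bytes);
    if (ptr == nullptr) { throw std::bad_alloc{}; }
    return ptr;
  }

  void deallocate(toy_stream stream, void* ptr, std::size_t bytes)
  {
    assert(ptr != nullptr);
    free_blocks_[key{stream, bytes}].push_back(ptr);  // cache instead of freeing
  }

  std::size_t reuse_count() const noexcept { return reuse_count_; }

 private:
  using key = std::pair<toy_stream, std::size_t>;
  std::map<key, std::vector<void*>> free_blocks_;
  std::size_t reuse_count_{0};
};
```

In this model, a block freed on stream 1 is returned to the next same-size request on stream 1, while a request on stream 2 gets fresh memory.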

A call to allocate(stream_a, bytes) (on any derived class) returns a pointer that is valid to use on stream_a. Using the memory on a different stream (say stream_b) is Undefined Behavior unless the two streams are first synchronized, for example by using cudaStreamSynchronize(stream_a) or by recording a CUDA event on stream_a and then calling cudaStreamWaitEvent(stream_b, event).

The stream specified to deallocate() should be a stream on which it is valid to use the deallocated memory immediately for another allocation. Typically this is the stream on which the allocation was last used before the call to deallocate(). The passed stream may be used internally by a device_memory_resource for managing available memory with minimal synchronization, and it may also be synchronized at a later time, for example using a call to cudaStreamSynchronize().

For this reason, it is Undefined Behavior to destroy a CUDA stream that is passed to deallocate(). If the stream on which the allocation was last used has been destroyed before calling deallocate() or it is known that it will be destroyed, it is likely better to synchronize the stream (before destroying it) and then pass a different stream to deallocate() (e.g. the default stream).

A device_memory_resource should only be used when the active CUDA device is the same device that was active when the device_memory_resource was created. Otherwise behavior is undefined.

Creating a device_memory_resource for each device requires care to set the current device before creating each resource, and to maintain the lifetime of the resources as long as they are set as per-device resources. Here is an example loop that creates unique_ptrs to pool_memory_resource objects for each device and sets them as the per-device resource for that device.

using pool_mr = rmm::mr::pool_memory_resource<rmm::mr::cuda_memory_resource>;
std::vector<std::unique_ptr<pool_mr>> per_device_pools;
for (int i = 0; i < N; ++i) {
  cudaSetDevice(i);  // make device i current before creating its resource
  // Note: for brevity, omitting creation of upstream and computing initial_size
  per_device_pools.push_back(std::make_unique<pool_mr>(upstream, initial_size));
  // Pass the raw pointer; the vector keeps the pool alive while it is the per-device resource
  rmm::mr::set_per_device_resource(rmm::cuda_device_id{i}, per_device_pools.back().get());
}

Member Function Documentation

◆ allocate()

void* rmm::mr::device_memory_resource::allocate	(	cuda_stream_view	stream,
		std::size_t	bytes,
		std::size_t	alignment = `rmm::CUDA_ALLOCATION_ALIGNMENT`
	)

inline

Allocates memory of size at least bytes on the specified stream.

The returned pointer will have 256-byte alignment regardless of the value of alignment. Alignments higher than 256 bytes require the aligned_resource_adaptor.

Exceptions

rmm::bad_alloc When the requested bytes cannot be allocated.

Parameters

stream	The stream on which to perform the allocation
bytes	The size of the allocation
alignment	The alignment of the allocation (see notes above)

Returns: void* Pointer to the newly allocated memory
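The alignment guarantee described for allocate() can be checked from caller code with a couple of small helpers. The sketch below is illustrative only: toy_allocation_alignment, pointer_is_aligned, and exceeds_default_alignment are hypothetical names defined here, and the value 256 is taken from the guarantee stated above rather than read from any RMM header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed value, taken from the 256-byte guarantee stated above.
constexpr std::size_t toy_allocation_alignment = 256;

// True when value is a power of two (valid alignments are powers of two).
constexpr bool is_power_of_two(std::size_t value)
{
  return value != 0 && (value & (value - 1)) == 0;
}

// True when ptr sits on an `alignment`-byte boundary.
inline bool pointer_is_aligned(void const* ptr, std::size_t alignment)
{
  assert(is_power_of_two(alignment));
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// True when a request asks for more than the default guarantee, which is the
// case the text above says needs the aligned_resource_adaptor.
constexpr bool exceeds_default_alignment(std::size_t alignment)
{
  return alignment > toy_allocation_alignment;
}
```

A caller could, for example, assert pointer_is_aligned(ptr, toy_allocation_alignment) on a returned pointer in a debug build, and use exceeds_default_alignment to decide whether a stricter requirement needs a different resource.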

◆ allocate_sync()

void* rmm::mr::device_memory_resource::allocate_sync	(	std::size_t	bytes,
		std::size_t	alignment = `rmm::CUDA_ALLOCATION_ALIGNMENT`
	)

inline

Allocates memory of size at least bytes.

The returned pointer will have 256-byte alignment regardless of the value of alignment. Alignments higher than 256 bytes require the aligned_resource_adaptor.

Exceptions

rmm::bad_alloc When the requested bytes cannot be allocated.

Parameters

bytes	The size of the allocation
alignment	The alignment of the allocation (see notes above)

Returns: void* Pointer to the newly allocated memory

◆ deallocate()

void rmm::mr::device_memory_resource::deallocate	(	cuda_stream_view	stream,
		void *	ptr,
		std::size_t	bytes,
		[[maybe_unused]] std::size_t	alignment = `rmm::CUDA_ALLOCATION_ALIGNMENT`
	)

inline noexcept

Deallocate memory pointed to by ptr on the specified stream.

Parameters

stream	The stream on which to perform the deallocation
ptr	Pointer to be deallocated
bytes	The size in bytes of the allocation. This must be equal to the value of `bytes` that was passed to the `allocate` call that returned `ptr`.
alignment	The alignment that was passed to the `allocate` call that returned `ptr`

◆ deallocate_sync()

void rmm::mr::device_memory_resource::deallocate_sync	(	void *	ptr,
		std::size_t	bytes,
		[[maybe_unused]] std::size_t	alignment = `rmm::CUDA_ALLOCATION_ALIGNMENT`
	)

inline noexcept

Deallocate memory pointed to by ptr.

Parameters

ptr	Pointer to be deallocated
bytes	The size in bytes of the allocation. This must be equal to the value of `bytes` that was passed to the `allocate_sync` call that returned `ptr`.
alignment	The alignment that was passed to the `allocate_sync` call that returned `ptr`

◆ is_equal()

bool rmm::mr::device_memory_resource::is_equal ( device_memory_resource const & other ) const

inline noexcept

Compare this resource to another.

Two device_memory_resources compare equal if and only if memory allocated from one device_memory_resource can be deallocated from the other and vice versa.

By default, simply checks if *this and other refer to the same object, i.e., does not check if they are two objects of the same class.

Parameters

other The other resource to compare to

Returns: If the two resources are equivalent
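The equality rule for is_equal() can be sketched with two hypothetical resource types. The class names below are invented for illustration and hold no memory at all; the sketch assumes the default comparison is implemented as an identity check in a private virtual function, and shows how a stateless resource might reasonably relax that to "same type", since memory from one such instance could be returned to another.

```cpp
#include <cassert>

// Hypothetical base: public non-virtual is_equal forwards to a private virtual.
class toy_comparable_resource {
 public:
  virtual ~toy_comparable_resource() = default;

  bool is_equal(toy_comparable_resource const& other) const noexcept
  {
    return do_is_equal(other);
  }

  bool operator==(toy_comparable_resource const& other) const noexcept { return is_equal(other); }
  bool operator!=(toy_comparable_resource const& other) const noexcept { return !is_equal(other); }

 private:
  // Default: identity comparison only; the dynamic type is not inspected.
  virtual bool do_is_equal(toy_comparable_resource const& other) const noexcept
  {
    return this == &other;
  }
};

// Hypothetical stateful resource: each instance owns its own memory, so the
// identity default is the right answer and nothing is overridden.
class toy_pool_like_resource final : public toy_comparable_resource {};

// Hypothetical stateless resource: any two instances are interchangeable.
class toy_stateless_resource final : public toy_comparable_resource {
 private:
  bool do_is_equal(toy_comparable_resource const& other) const noexcept override
  {
    return dynamic_cast<toy_stateless_resource const*>(&other) != nullptr;
  }
};
```

With these definitions, two distinct toy_pool_like_resource objects compare unequal, while two distinct toy_stateless_resource objects compare equal.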

◆ operator!=()

bool rmm::mr::device_memory_resource::operator!= ( device_memory_resource const & other ) const

inline noexcept

Comparison operator with another device_memory_resource.

Parameters

other The other resource to compare to

Returns: false If the two resources are equivalent; true If the two resources are not equivalent

◆ operator=() [1/2]

device_memory_resource& rmm::mr::device_memory_resource::operator= ( device_memory_resource && )

default noexcept

Default move assignment operator.

Returns: device_memory_resource& Reference to the assigned object

◆ operator=() [2/2]

device_memory_resource& rmm::mr::device_memory_resource::operator= ( device_memory_resource const & )

default

Default copy assignment operator.

Returns: device_memory_resource& Reference to the assigned object

◆ operator==()

bool rmm::mr::device_memory_resource::operator== ( device_memory_resource const & other ) const

inline noexcept

Comparison operator with another device_memory_resource.

Parameters

other The other resource to compare to

Returns: true If the two resources are equivalent; false If the two resources are not equivalent

Friends And Related Function Documentation

◆ get_property

void get_property	(	device_memory_resource const &	,
		cuda::mr::device_accessible
	)

friend

Enables the cuda::mr::device_accessible property.

This property declares that a device_memory_resource provides device-accessible memory.

The documentation for this class was generated from the following file:

device_memory_resource.hpp
