rmm::device_scalar< T > Class Template Reference

Container for a single object of type T in device memory.

#include <device_scalar.hpp>

Public Member Functions

 device_scalar (cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new uninitialized device_scalar.
 
 device_scalar (T const &initial_value, cuda_stream_view stream=cuda_stream_view{}, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_scalar with an initial value.
 
 device_scalar (device_scalar const &other, cuda_stream_view stream={}, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_scalar by deep copying the contents of another device_scalar, using the specified stream and memory resource.
 
T value (cuda_stream_view stream=cuda_stream_view{}) const
 Copies the value from device to host, synchronizes, and returns the value.
 
template<typename U = T>
auto set_value (U const &host_value, cuda_stream_view stream=cuda_stream_view{}) -> std::enable_if_t< std::is_fundamental< U >::value && not std::is_same< U, bool >::value, void >
 Sets the value of the device_scalar to the given host_value.
 
template<typename U = T>
auto set_value (U const &host_value, cuda_stream_view stream=cuda_stream_view{}) -> std::enable_if_t< std::is_same< U, bool >::value, void >
 Sets the value of the device_scalar to the given host_value.
 
template<typename U = T>
auto set_value (T const &host_value, cuda_stream_view stream=cuda_stream_view{}) -> std::enable_if_t< not std::is_fundamental< U >::value, void >
 Sets the value of the device_scalar to the given host_value.
 
void set_value (T &&host_value, cuda_stream_view stream=cuda_stream_view{})=delete
 
template<typename U = T>
auto set_value_zero (cuda_stream_view stream=cuda_stream_view{}) -> std::enable_if_t< std::is_fundamental< U >::value, void >
 Sets the value of the device_scalar to zero.
 
T * data () noexcept
 Returns pointer to object in device memory.
 
T const * data () const noexcept
 Returns const pointer to object in device memory.
 
 device_scalar (device_scalar &&)=default
 
device_scalar & operator= (device_scalar const &)=delete
 
device_scalar & operator= (device_scalar &&)=delete
 

Detailed Description

template<typename T>
class rmm::device_scalar< T >

Container for a single object of type T in device memory.

T must be trivially copyable.

Template Parameters
T: The object's type
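
To make the trivially-copyable requirement concrete, here is a minimal host-only sketch that checks a few candidate types with the standard `std::is_trivially_copyable` trait. The types `point` and `labeled` and the helper `storable_in_device_scalar` are hypothetical names invented for this illustration and are not part of RMM; how (or whether) RMM itself enforces the requirement at compile time is not stated on this page.

```cpp
#include <string>
#include <type_traits>

// Hypothetical user types, invented for this illustration (not part of RMM).
struct point {
  float x;
  float y;
};

struct labeled {
  int id;
  std::string name;  // std::string has a non-trivial copy constructor
};

// Hypothetical helper: true when objects of type T can be duplicated by
// copying their bytes, which is what a host/device memcpy does.
template <typename T>
constexpr bool storable_in_device_scalar()
{
  return std::is_trivially_copyable<T>::value;
}

static_assert(storable_in_device_scalar<int>(), "fundamental types qualify");
static_assert(storable_in_device_scalar<point>(), "plain aggregates qualify");
static_assert(!storable_in_device_scalar<labeled>(), "a std::string member disqualifies");
```

A check like this can be placed next to a user-defined type so that an unsuitable member is caught at compile time rather than showing up as corrupted data after a byte-wise copy.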

Constructor & Destructor Documentation

◆ device_scalar() [1/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new uninitialized device_scalar.

Does not synchronize the stream.

Note
This device_scalar is only safe to access in kernels and copies on the specified CUDA stream, or on another stream only if a dependency is enforced (e.g. using cudaStreamWaitEvent()).
Exceptions
rmm::bad_alloc if allocating the device memory fails.
Parameters
stream: Stream on which to perform asynchronous allocation.
mr: Optional, resource with which to allocate.

◆ device_scalar() [2/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( T const &  initial_value,
cuda_stream_view  stream = cuda_stream_view{},
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_scalar with an initial value.

Does not synchronize the stream.

Note
This device_scalar is only safe to access in kernels and copies on the specified CUDA stream, or on another stream only if a dependency is enforced (e.g. using cudaStreamWaitEvent()).
Exceptions
rmm::bad_alloc if allocating the device memory for initial_value fails.
rmm::cuda_error if copying initial_value to device memory fails.
Parameters
initial_value: The initial value of the object in device memory.
stream: Optional, stream on which to perform allocation and copy.
mr: Optional, resource with which to allocate.

◆ device_scalar() [3/3]

template<typename T >
rmm::device_scalar< T >::device_scalar ( device_scalar< T > const &  other,
cuda_stream_view  stream = {},
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline

Construct a new device_scalar by deep copying the contents of another device_scalar, using the specified stream and memory resource.

Exceptions
rmm::bad_alloc if creating the new allocation fails.
rmm::cuda_error if copying from other fails.
Parameters
other: The device_scalar whose contents will be copied.
stream: The stream to use for the allocation and copy.
mr: The resource to use for allocating the new device_scalar.

Member Function Documentation

◆ data() [1/2]

template<typename T >
T const* rmm::device_scalar< T >::data ( ) const
inline noexcept

Returns const pointer to object in device memory.

Note
If the returned device pointer is used on a CUDA stream different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()), otherwise there may be a race condition.

◆ data() [2/2]

template<typename T >
T* rmm::device_scalar< T >::data ( )
inline noexcept

Returns pointer to object in device memory.

Note
If the returned device pointer is used on a CUDA stream different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()), otherwise there may be a race condition.

◆ set_value() [1/3]

template<typename T >
template<typename U = T>
auto rmm::device_scalar< T >::set_value ( T const &  host_value,
cuda_stream_view  stream = cuda_stream_view{} 
) -> std::enable_if_t<not std::is_fundamental<U>::value, void>
inline

Sets the value of the device_scalar to the given host_value.

Specialization for non-fundamental types.

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning. Therefore, the object referenced by host_value should not be destroyed or modified until stream has been synchronized. Otherwise, behavior is undefined.

Note
This function incurs a host-to-device memcpy and should be used sparingly.

Example:

// `scalar` is an rmm::device_scalar<my_type>
my_type v{42, "text"};
// Copies `v` to device storage on `stream`. Does _not_ synchronize
scalar.set_value(v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v.value = 21;
Exceptions
rmm::cuda_error if copying host_value to device memory fails.
rmm::cuda_error if synchronizing stream fails.
Parameters
host_value: The host value which will be copied to device.
stream: CUDA stream on which to perform the copy.
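
The lifetime requirement on host_value is presumably also why the member list declares `set_value (T &&host_value, ...)` as deleted: a temporary would be destroyed before an asynchronous copy could read it. The host-only sketch below models that idea under simplifying assumptions. `scalar_model`, `payload`, `synchronize()`, `stored()`, and `can_set_value` are hypothetical names for this illustration, not RMM API, and the deferred copy is simulated with a stored pointer rather than a CUDA stream.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical trivially copyable payload type (not part of RMM).
struct payload {
  int id;
  float weight;
};

// Hypothetical host-only stand-in for a container whose copy is deferred.
class scalar_model {
 public:
  // Remembers only the address: the simulated asynchronous copy reads the
  // source later, so the source must still be alive and unmodified then.
  void set_value(payload const& host_value) { pending_source_ = &host_value; }

  // A temporary dies at the end of the full expression, before a deferred
  // copy could read it, so rvalue arguments are rejected at compile time.
  void set_value(payload&& host_value) = delete;

  // Stand-in for stream synchronization: performs the deferred copy.
  void synchronize()
  {
    if (pending_source_ != nullptr) {
      stored_         = *pending_source_;
      pending_source_ = nullptr;
    }
  }

  payload const& stored() const { return stored_; }

 private:
  payload stored_{};
  payload const* pending_source_{nullptr};
};

// Detects whether scalar_model::set_value accepts an argument of type Arg.
template <typename Arg, typename = void>
struct can_set_value : std::false_type {};

template <typename Arg>
struct can_set_value<Arg, decltype(std::declval<scalar_model&>().set_value(std::declval<Arg>()))>
  : std::true_type {};

// Exercises the sketch with a named object that outlives the deferred copy.
void run_example()
{
  static_assert(can_set_value<payload&>::value, "named objects are accepted");
  static_assert(!can_set_value<payload>::value, "temporaries are rejected");

  scalar_model s;
  payload p{7, 1.5f};
  s.set_value(p);   // p must stay alive and unmodified until synchronize()
  s.synchronize();  // the deferred copy happens here
  assert(s.stored().id == 7);
}
```

With the deleted overload in place, a call such as `s.set_value(payload{1, 2.0f})` fails to compile instead of leaving a dangling reference behind.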

◆ set_value() [2/3]

template<typename T >
template<typename U = T>
auto rmm::device_scalar< T >::set_value ( U const &  host_value,
cuda_stream_view  stream = cuda_stream_view{} 
) -> std::enable_if_t<std::is_fundamental<U>::value && not std::is_same<U, bool>::value, void>
inline

Sets the value of the device_scalar to the given host_value.

This specialization for fundamental types is optimized to use cudaMemsetAsync when host_value is zero.

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning. Therefore, the object referenced by host_value should not be destroyed or modified until stream has been synchronized. Otherwise, behavior is undefined.

Note
This function incurs a host-to-device memcpy or device memset and should be used sparingly.

Example:

// `scalar` is an rmm::device_scalar<int>
int v{42};
// Copies 42 to device storage on `stream`. Does _not_ synchronize
scalar.set_value(v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v = 13;
Exceptions
rmm::cuda_error if copying host_value to device memory fails.
Parameters
host_value: The host value which will be copied to device.
stream: CUDA stream on which to perform the copy.

◆ set_value() [3/3]

template<typename T >
template<typename U = T>
auto rmm::device_scalar< T >::set_value ( U const &  host_value,
cuda_stream_view  stream = cuda_stream_view{} 
) -> std::enable_if_t<std::is_same<U, bool>::value, void>
inline

Sets the value of the device_scalar to the given host_value.

This specialization for bool is optimized to always use cudaMemsetAsync.

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning. host_value is passed by value so a host-side copy may be performed before calling a device memset.

Note
This function incurs a device memset.

Example:

// `scalar` is an rmm::device_scalar<bool>
bool v{true};
// Copies `true` to device storage on `stream`. Does _not_ synchronize
scalar.set_value(v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v = false;
Exceptions
rmm::cuda_error if the device memset fails.
Parameters
host_value: The host value which the scalar will be set to (true or false).
stream: CUDA stream on which to perform the device memset.
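
The three set_value overloads are selected at compile time through `std::enable_if_t`. The host-only sketch below mirrors the documented conditions to show which overload a given type resolves to. It is a model under stated assumptions, not the RMM implementation: `scalar_model`, `copy_path`, and `pair_of_ints` are hypothetical names, the stream parameter is omitted, and each overload returns a tag describing the documented strategy instead of touching device memory. It assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag naming the strategy an overload would use.
enum class copy_path { memset, memcpy };

// Hypothetical host-only stand-in that reproduces the documented
// enable_if conditions of the three set_value overloads.
template <typename T>
class scalar_model {
 public:
  // Fundamental, non-bool types: a zero value can be written with a memset,
  // any other value needs a copy from the host.
  template <typename U = T>
  auto set_value(U const& host_value)
    -> std::enable_if_t<std::is_fundamental<U>::value && !std::is_same<U, bool>::value, copy_path>
  {
    return host_value == U{0} ? copy_path::memset : copy_path::memcpy;
  }

  // bool: always handled with a memset.
  template <typename U = T>
  auto set_value(U const& /*host_value*/)
    -> std::enable_if_t<std::is_same<U, bool>::value, copy_path>
  {
    return copy_path::memset;
  }

  // Non-fundamental types: always copied from the host.
  template <typename U = T>
  auto set_value(T const& /*host_value*/)
    -> std::enable_if_t<!std::is_fundamental<U>::value, copy_path>
  {
    return copy_path::memcpy;
  }
};

// Hypothetical non-fundamental, trivially copyable type.
struct pair_of_ints {
  int a;
  int b;
};

// Exercises each overload once.
void run_example()
{
  scalar_model<int> i;
  assert(i.set_value(0) == copy_path::memset);   // zero takes the memset path
  assert(i.set_value(42) == copy_path::memcpy);  // non-zero takes the copy path

  scalar_model<bool> b;
  assert(b.set_value(false) == copy_path::memset);

  scalar_model<pair_of_ints> p;
  assert(p.set_value(pair_of_ints{1, 2}) == copy_path::memcpy);
}
```

Because exactly one condition holds for any given type, overload resolution never sees more than one viable candidate, which is what keeps the three declarations from being ambiguous.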

◆ set_value_zero()

template<typename T >
template<typename U = T>
auto rmm::device_scalar< T >::set_value_zero ( cuda_stream_view  stream = cuda_stream_view{}) -> std::enable_if_t<std::is_fundamental<U>::value, void>
inline

Sets the value of the device_scalar to zero.

Only supported for fundamental types.

Note
If the stream specified to this function is different from the stream specified to the constructor, then appropriate dependencies must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before and after calling this function, otherwise there may be a race condition.

This function does not synchronize stream before returning.

Note
This function incurs a device memset and should be used sparingly.
Exceptions
rmm::cuda_error if the device memset fails.
Parameters
stream: CUDA stream on which to perform the device memset.
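
One way to see why zeroing with a memset is offered only for fundamental types: for integers, bool, and (on IEEE 754 platforms) floating-point types, an object whose bytes are all zero holds the value zero. The host-only sketch below demonstrates this with `std::memset`. `zeroed_by_memset` is a hypothetical helper for this illustration, not RMM API, and the floating-point case assumes IEEE 754 representation.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical host-side analogue of zeroing device storage with a memset.
// Like set_value_zero, it only participates for fundamental types.
template <typename T>
auto zeroed_by_memset() -> std::enable_if_t<std::is_fundamental<T>::value, T>
{
  T value;
  std::memset(&value, 0, sizeof(T));  // overwrite every byte of the object with 0
  return value;
}

// Exercises the sketch for an integer, a floating-point type, and bool.
void run_example()
{
  assert(zeroed_by_memset<int>() == 0);
  assert(zeroed_by_memset<double>() == 0.0);  // assumes IEEE 754 doubles
  assert(zeroed_by_memset<bool>() == false);
}
```

For a class type there is no such guarantee that all-zero bytes form a meaningful "zero" object, which is consistent with the function being restricted to fundamental types.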

◆ value()

template<typename T >
T rmm::device_scalar< T >::value ( cuda_stream_view  stream = cuda_stream_view{}) const
inline

Copies the value from device to host, synchronizes, and returns the value.

Synchronizes stream after copying the data from device to host.

Note
If the stream specified to this function is different from the stream specified to the constructor, then an appropriate dependency must be inserted between the streams (e.g. using cudaStreamWaitEvent() or cudaStreamSynchronize()) before calling this function, otherwise there may be a race condition.
Exceptions
rmm::cuda_error if the copy fails.
rmm::cuda_error if synchronizing stream fails.
Returns
The value of the scalar, of type T.
Parameters
stream: CUDA stream on which to perform the copy and synchronize.

The documentation for this class was generated from the following file:
device_scalar.hpp