rmm::device_uvector< T > Class Template Reference

An uninitialized vector of elements in device memory.

#include <device_uvector.hpp>

Public Types

using value_type = T
 
using size_type = std::size_t
 
using reference = value_type &
 
using const_reference = value_type const &
 
using pointer = value_type *
 
using const_pointer = value_type const *
 
using iterator = pointer
 
using const_iterator = const_pointer
 

Public Member Functions

RMM_EXEC_CHECK_DISABLE device_uvector (device_uvector &&)=default
 
device_uvector & operator= (device_uvector &&)=default
 
 device_uvector (device_uvector const &)=delete
 Copy constructor is deleted as it doesn't allow a stream argument.
 
device_uvector & operator= (device_uvector const &)=delete
 Copy assignment is deleted as it doesn't allow a stream argument.
 
 device_uvector ()=delete
 Default constructor is deleted as it doesn't allow a stream argument.
 
 device_uvector (std::size_t size, cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_uvector with sufficient uninitialized storage for size elements.
 
 device_uvector (device_uvector const &other, cuda_stream_view stream, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Construct a new device_uvector by deep copying the contents of another device_uvector.
 
pointer element_ptr (std::size_t element_index) noexcept
 Returns pointer to the specified element.
 
const_pointer element_ptr (std::size_t element_index) const noexcept
 Returns pointer to the specified element.
 
void set_element (std::size_t element_index, T const &v, cuda_stream_view s)
 Performs a synchronous copy of v to the specified element in device memory.
 
void set_element_async (std::size_t element_index, value_type const &v, cuda_stream_view s)
 Performs an asynchronous copy of v to the specified element in device memory.
 
void set_element_async (std::size_t element_index, value_type const &&v, cuda_stream_view s)=delete
 
value_type element (std::size_t element_index, cuda_stream_view s) const
 Returns the specified element from device memory.
 
value_type front_element (cuda_stream_view s) const
 Returns the first element.
 
value_type back_element (cuda_stream_view s) const
 Returns the last element.
 
void resize (std::size_t new_size, cuda_stream_view stream)
 Resizes the vector to contain new_size elements.
 
void shrink_to_fit (cuda_stream_view stream)
 Forces deallocation of unused device memory.
 
device_buffer release () noexcept
 Release ownership of device memory storage.
 
std::size_t capacity () const noexcept
 Returns the number of elements that can be held in currently allocated storage.
 
pointer data () noexcept
 Returns pointer to underlying device storage.
 
const_pointer data () const noexcept
 Returns const pointer to underlying device storage.
 
iterator begin () noexcept
 Returns an iterator to the first element.
 
const_iterator cbegin () const noexcept
 Returns a const_iterator to the first element.
 
const_iterator begin () const noexcept
 Returns a const_iterator to the first element.
 
iterator end () noexcept
 Returns an iterator to the element following the last element of the vector.
 
const_iterator cend () const noexcept
 Returns a const_iterator to the element following the last element of the vector.
 
const_iterator end () const noexcept
 Returns a const_iterator to the element following the last element of the vector.
 
std::size_t size () const noexcept
 Returns the number of elements in the vector.
 
bool is_empty () const noexcept
 Returns true if the vector contains no elements, i.e., size() == 0.
 
mr::device_memory_resource * memory_resource () const noexcept
 Returns pointer to the resource used to allocate and deallocate the device storage.
 

Detailed Description

template<typename T>
class rmm::device_uvector< T >

An uninitialized vector of elements in device memory.

Similar to a thrust::device_vector, device_uvector is a random access container of elements stored contiguously in device memory. However, unlike thrust::device_vector, device_uvector does not default initialize the vector elements.

If initialization is desired, this must be done explicitly by the caller, e.g., with thrust::uninitialized_fill.

Example:

rmm::mr::device_memory_resource * mr = new my_custom_resource();
// Allocates *uninitialized* device memory on stream `s` sufficient for 100 ints using the
// supplied resource `mr`
rmm::device_uvector<int> uv(100, s, mr);
// Initializes all elements to 0 on stream `s`
thrust::uninitialized_fill(thrust::cuda::par.on(s), uv.begin(), uv.end(), 0);

Avoiding default initialization improves performance by eliminating the kernel launch required to default initialize the elements. This initialization is often unnecessary, e.g., when the vector is created to hold some output from some operation.

However, this restricts the element type T to only trivially copyable types. In short, trivially copyable types can be safely copied with memcpy. For more information, see https://en.cppreference.com/w/cpp/types/is_trivially_copyable.
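The trivially copyable requirement can be checked at compile time with the standard type trait. The following is a minimal host-only sketch, not part of RMM: the `point` type and the helpers `is_valid_element_type` and `bytewise_copy` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical element type used only for illustration.
struct point {
  float x;
  float y;
};

// Hypothetical helper: mirrors the documented requirement on T.
template <typename T>
constexpr bool is_valid_element_type()
{
  return std::is_trivially_copyable<T>::value;
}

static_assert(is_valid_element_type<int>(), "int is trivially copyable");
static_assert(is_valid_element_type<point>(), "plain aggregates are trivially copyable");
static_assert(!is_valid_element_type<std::string>(), "std::string manages memory and is not");

// Hypothetical helper: copies a value bytewise with memcpy, which is only
// well-defined for trivially copyable types. No copy constructor of T is
// used to produce the bytes of the result.
template <typename T>
T bytewise_copy(T const& src)
{
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
  T dst;
  std::memcpy(&dst, &src, sizeof(T));
  return dst;
}
```

A type such as `point` would therefore be acceptable as `T`, while `std::string` would not.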

Another key difference from thrust::device_vector is that all operations that invoke allocation, kernels, or memcpys take a CUDA stream parameter to indicate on which stream the operation will be performed.

Template Parameters
T  Trivially copyable element type

Constructor & Destructor Documentation

◆ device_uvector() [1/2]

template<typename T >
rmm::device_uvector< T >::device_uvector ( std::size_t  size,
cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_uvector with sufficient uninitialized storage for size elements.

Elements are uninitialized. Reading an element before it is initialized results in undefined behavior.

Parameters
size  The number of elements to allocate storage for
stream  The stream on which to perform the allocation
mr  The resource used to allocate the device storage

◆ device_uvector() [2/2]

template<typename T >
rmm::device_uvector< T >::device_uvector ( device_uvector< T > const &  other,
cuda_stream_view  stream,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)
inline explicit

Construct a new device_uvector by deep copying the contents of another device_uvector.

Elements are copied as if by memcpy, i.e., T's copy constructor is not invoked.

Parameters
other  The vector to copy from
stream  The stream on which to perform the copy
mr  The resource used to allocate device memory for the new vector

Member Function Documentation

◆ back_element()

template<typename T >
value_type rmm::device_uvector< T >::back_element ( cuda_stream_view  s) const
inline

Returns the last element.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream s.
Exceptions
rmm::out_of_range  exception if the vector is empty.
Parameters
s  The stream on which to perform the copy
Returns
The value of the last element

◆ begin() [1/2]

template<typename T >
const_iterator rmm::device_uvector< T >::begin ( ) const
inline noexcept

Returns a const_iterator to the first element.

If the vector is empty, then begin() == end().

Returns
Immutable iterator to the first element.

◆ begin() [2/2]

template<typename T >
iterator rmm::device_uvector< T >::begin ( )
inline noexcept

Returns an iterator to the first element.

If the vector is empty, then begin() == end().

Returns
Iterator to the first element.

◆ capacity()

template<typename T >
std::size_t rmm::device_uvector< T >::capacity ( ) const
inline noexcept

Returns the number of elements that can be held in currently allocated storage.

Returns
std::size_t The number of elements that can be stored without requiring a new allocation.

◆ cbegin()

template<typename T >
const_iterator rmm::device_uvector< T >::cbegin ( ) const
inline noexcept

Returns a const_iterator to the first element.

If the vector is empty, then cbegin() == cend().

Returns
Immutable iterator to the first element.

◆ cend()

template<typename T >
const_iterator rmm::device_uvector< T >::cend ( ) const
inline noexcept

Returns a const_iterator to the element following the last element of the vector.

The element referenced by cend() is a placeholder and dereferencing it results in undefined behavior.

Returns
Immutable iterator to one past the last element.

◆ data() [1/2]

template<typename T >
const_pointer rmm::device_uvector< T >::data ( ) const
inline noexcept

Returns const pointer to underlying device storage.

Note
If size() == 0 it is undefined behavior to dereference the returned pointer. Furthermore, the returned pointer may or may not be equal to nullptr.
Returns
const_pointer Raw const pointer to element storage in device memory.

◆ data() [2/2]

template<typename T >
pointer rmm::device_uvector< T >::data ( )
inline noexcept

Returns pointer to underlying device storage.

Note
If size() == 0 it is undefined behavior to dereference the returned pointer. Furthermore, the returned pointer may or may not be equal to nullptr.
Returns
Raw pointer to element storage in device memory.

◆ element()

template<typename T >
value_type rmm::device_uvector< T >::element ( std::size_t  element_index,
cuda_stream_view  s 
) const
inline

Returns the specified element from device memory.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream s.
Exceptions
rmm::out_of_range  exception if element_index >= size()
Parameters
element_index  Index of the desired element
s  The stream on which to perform the copy
Returns
The value of the specified element
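The difference between the checked by-value accessors (`element`, `front_element`, `back_element`) and the unchecked `element_ptr` can be illustrated with a host-only model. This is a sketch under stated assumptions, not RMM code: `checked_access_model` is a hypothetical class backed by `std::vector<int>`, `std::out_of_range` stands in for `rmm::out_of_range`, and the stream parameter and device-to-host copy are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical host-only model of the accessor contracts; not the RMM class.
class checked_access_model {
 public:
  explicit checked_access_model(std::vector<int> values) : values_(std::move(values)) {}

  // Checked access: returns a copy of the element, throws if out of range.
  // (std::out_of_range stands in for rmm::out_of_range in this sketch.)
  int element(std::size_t element_index) const
  {
    if (element_index >= values_.size()) {
      throw std::out_of_range("element_index >= size()");
    }
    return values_[element_index];
  }

  // Both delegate to element(), so an empty container throws. For an empty
  // container size() - 1 wraps around to a huge index, which fails the check.
  int front_element() const { return element(0); }
  int back_element() const { return element(values_.size() - 1); }

  // Unchecked access: no bounds test, so an out-of-range index is undefined
  // behavior once the returned pointer is dereferenced.
  int const* element_ptr(std::size_t element_index) const noexcept
  {
    return values_.data() + element_index;
  }

 private:
  std::vector<int> values_;
};
```

In the real class the checked accessors additionally copy the value from device to host and synchronize the stream, which is why the documentation recommends using them sparingly.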

◆ element_ptr() [1/2]

template<typename T >
const_pointer rmm::device_uvector< T >::element_ptr ( std::size_t  element_index) const
inline noexcept

Returns pointer to the specified element.

Behavior is undefined if element_index >= size().

Parameters
element_index  Index of the specified element.
Returns
T* Pointer to the desired element

◆ element_ptr() [2/2]

template<typename T >
pointer rmm::device_uvector< T >::element_ptr ( std::size_t  element_index)
inline noexcept

Returns pointer to the specified element.

Behavior is undefined if element_index >= size().

Parameters
element_index  Index of the specified element.
Returns
T* Pointer to the desired element

◆ end() [1/2]

template<typename T >
const_iterator rmm::device_uvector< T >::end ( ) const
inline noexcept

Returns a const_iterator to the element following the last element of the vector.

The element referenced by end() is a placeholder and dereferencing it results in undefined behavior.

Returns
Immutable iterator to one past the last element.

◆ end() [2/2]

template<typename T >
iterator rmm::device_uvector< T >::end ( )
inline noexcept

Returns an iterator to the element following the last element of the vector.

The element referenced by end() is a placeholder and dereferencing it results in undefined behavior.

Returns
Iterator to one past the last element.

◆ front_element()

template<typename T >
value_type rmm::device_uvector< T >::front_element ( cuda_stream_view  s) const
inline

Returns the first element.

Note
This function incurs a device-to-host memcpy and should be used sparingly.
This function synchronizes stream s.
Exceptions
rmm::out_of_range  exception if the vector is empty.
Parameters
s  The stream on which to perform the copy
Returns
The value of the first element

◆ is_empty()

template<typename T >
bool rmm::device_uvector< T >::is_empty ( ) const
inline noexcept

Returns true if the vector contains no elements, i.e., size() == 0.

Returns
true The vector is empty
false The vector is not empty

◆ memory_resource()

template<typename T >
mr::device_memory_resource* rmm::device_uvector< T >::memory_resource ( ) const
inline noexcept

Returns pointer to the resource used to allocate and deallocate the device storage.

Returns
Pointer to underlying resource

◆ release()

template<typename T >
device_buffer rmm::device_uvector< T >::release ( )
inline noexcept

Release ownership of device memory storage.

Returns
The device_buffer used to store the vector elements

◆ resize()

template<typename T >
void rmm::device_uvector< T >::resize ( std::size_t  new_size,
cuda_stream_view  stream 
)
inline

Resizes the vector to contain new_size elements.

If new_size > size(), the additional elements are uninitialized.

If new_size < capacity(), no action is taken other than updating the value of size(). No memory is allocated nor copied. shrink_to_fit() may be used to force deallocation of unused memory.

If new_size > capacity(), elements are copied as if by memcpy to a new allocation.

The invariant size() <= capacity() holds.

Parameters
new_size  The desired number of elements
stream  The stream on which to perform the allocation/copy (if any)
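The size and capacity rules described for resize(), together with shrink_to_fit(), can be modeled on the host. The sketch below is an assumption-laden illustration, not the RMM implementation: `host_uvector_model` is a hypothetical class that stores `int` in host memory, omits the stream argument, and only reproduces the bookkeeping (when a reallocation and memcpy happen, and when only size() changes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical host-only model of the documented resize rules; not RMM code.
class host_uvector_model {
 public:
  // Allocates uninitialized storage for `size` ints.
  explicit host_uvector_model(std::size_t size)
    : data_(new int[size]), size_(size), capacity_(size)
  {
  }

  void resize(std::size_t new_size)
  {
    // Growing past capacity: copy existing elements into a new allocation.
    if (new_size > capacity_) { reallocate(new_size); }
    // Otherwise only the size changes; nothing is allocated or copied.
    size_ = new_size;
    assert(size_ <= capacity_);  // documented invariant
  }

  void shrink_to_fit()
  {
    // Reallocate only when there is unused capacity to give back.
    if (capacity_ > size_) { reallocate(size_); }
  }

  std::size_t size() const noexcept { return size_; }
  std::size_t capacity() const noexcept { return capacity_; }
  int* data() noexcept { return data_.get(); }

 private:
  // Moves the first size_ elements into a fresh buffer of new_capacity ints.
  // Callers guarantee size_ <= new_capacity.
  void reallocate(std::size_t new_capacity)
  {
    std::unique_ptr<int[]> fresh(new int[new_capacity]);
    std::memcpy(fresh.get(), data_.get(), size_ * sizeof(int));
    data_     = std::move(fresh);
    capacity_ = new_capacity;
  }

  std::unique_ptr<int[]> data_;
  std::size_t size_;
  std::size_t capacity_;
};
```

In this model, shrinking with resize() leaves the storage pointer unchanged, while growing past capacity() or calling shrink_to_fit() with spare capacity replaces it, so pointers and iterators obtained earlier should be treated as invalid after those calls.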

◆ set_element()

template<typename T >
void rmm::device_uvector< T >::set_element ( std::size_t  element_index,
T const &  v,
cuda_stream_view  s 
)
inline

Performs a synchronous copy of v to the specified element in device memory.

Because this function synchronizes the stream s, it is safe to destroy or modify the object referenced by v after this function has returned.

Note
This function incurs a host-to-device memcpy and should be used sparingly.
This function synchronizes stream s.

Example:

int v{42};
// Copies 42 to element 0 on `stream` and synchronizes the stream
vec.set_element(0, v, stream);
// It is safe to destroy or modify `v`
v = 13;
Exceptions
rmm::out_of_range  exception if element_index >= size()
Parameters
element_index  Index of the target element
v  The value to copy to the specified element
s  The stream on which to perform the copy

◆ set_element_async()

template<typename T >
void rmm::device_uvector< T >::set_element_async ( std::size_t  element_index,
value_type const &  v,
cuda_stream_view  s 
)
inline

Performs an asynchronous copy of v to the specified element in device memory.

This function does not synchronize stream s before returning. Therefore, the object referenced by v must not be destroyed or modified until stream s has been synchronized. Otherwise, behavior is undefined.

Note
This function incurs a host-to-device memcpy and should be used sparingly.
Calling this function with a literal or other rvalue for v is disallowed (the overload taking value_type const && is deleted). This prevents the implementation from asynchronously copying from a literal or other implicit temporary after it has been destroyed or has gone out of scope.

Example:

int v{42};
// Copies 42 to element 0 on `stream`. Does _not_ synchronize
vec.set_element_async(0, v, stream);
...
cudaStreamSynchronize(stream);
// Synchronization is required before `v` can be modified
v = 13;
Exceptions
rmm::out_of_range  exception if element_index >= size()
Parameters
element_index  Index of the target element
v  The value to copy to the specified element
s  The stream on which to perform the copy
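The deleted `value_type const &&` overload listed among the member functions is what rejects temporaries at compile time. The host-only sketch below shows the general C++ technique, under stated assumptions: `async_setter_model` and `can_set_async` are hypothetical names, the element type is fixed to `int`, and the index and stream parameters are omitted.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical host-only type mimicking the overload set; not the RMM class.
struct async_setter_model {
  // Address that a deferred (asynchronous) copy would read from later.
  int const* pending_source = nullptr;

  // Accepts lvalues: the caller is responsible for keeping `v` alive and
  // unmodified until the copy has completed.
  void set_element_async(int const& v) { pending_source = &v; }

  // For rvalue arguments, binding to `int const&&` is a better match than
  // binding to `int const&`, so this deleted overload is selected and the
  // call fails to compile.
  void set_element_async(int const&& v) = delete;
};

// Compile-time detector: true when set_element_async(Arg) is well-formed.
template <typename Arg, typename = void>
struct can_set_async : std::false_type {};

template <typename Arg>
struct can_set_async<
  Arg,
  decltype(std::declval<async_setter_model&>().set_element_async(std::declval<Arg>()))>
  : std::true_type {};

static_assert(can_set_async<int&>::value, "named variables are accepted");
static_assert(can_set_async<int const&>::value, "const lvalues are accepted");
static_assert(!can_set_async<int>::value, "temporaries and literals are rejected");
static_assert(!can_set_async<int&&>::value, "moved-from arguments are rejected");
```

Without the deleted overload, a call such as `set_element_async(42)` would bind the literal to `int const&`, and the stored address would dangle as soon as the full expression ended.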

◆ shrink_to_fit()

template<typename T >
void rmm::device_uvector< T >::shrink_to_fit ( cuda_stream_view  stream)
inline

Forces deallocation of unused device memory.

If capacity() > size(), reallocates and copies vector contents to eliminate unused memory.

Parameters
stream  Stream on which to perform allocation and copy

◆ size()

template<typename T >
std::size_t rmm::device_uvector< T >::size ( ) const
inline noexcept

Returns the number of elements in the vector.

Returns
The number of elements.

The documentation for this class was generated from the following file:
device_uvector.hpp