Extracting

Files

file  lists/extract.hpp
 

Functions

std::unique_ptr< column > cudf::lists::extract_list_element (lists_column_view const &lists_column, size_type index, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=rmm::mr::get_current_device_resource())
 Create a column where each row is the element at position index from the corresponding sublist in the input lists_column.

std::unique_ptr< column > cudf::lists::extract_list_element (lists_column_view const &lists_column, column_view const &indices, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=rmm::mr::get_current_device_resource())
 Create a column where each row is a single element from the corresponding sublist in the input lists_column, selected using indices from the indices column.
 

Function Documentation

◆ extract_list_element() [1/2]

std::unique_ptr<column> cudf::lists::extract_list_element ( lists_column_view const &  lists_column,
column_view const &  indices,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)

Create a column where each row is a single element from the corresponding sublist in the input lists_column, selected using indices from the indices column.

Output column[i] is set from element lists_column[i][indices[i]]. If indices[i] is out of bounds for the sublist at lists_column[i] (that is, greater than or equal to its size), then output column[i] = null. Similarly, if indices[i] is null, then output column[i] = null.

l = { {1, 2, 3}, {4}, {5, 6} }
r = extract_list_element(l, {0, null, 2})
r is now {1, null, null}

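indices[i] may also be negative, in which case the element is counted back from the end of the sublist, so -1 selects the last element. A negative index that reaches past the start of the sublist produces null.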

l = { {"a"}, {"b", "c"}, {"d", "e", "f"} }
r = extract_list_element(l, {-1, -2, -4})
r is now {"a", "b", null}

Any input where lists_column[i] == null produces output column[i] = null. Any input where lists_column[i][indices[i]] == null produces output column[i] = null.
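
The per-row rules above can be summarized with a small host-side reference model. This is only a sketch of the documented semantics, not the libcudf implementation: the aliases Element, ListRow and Index and the function extract_model are hypothetical names introduced here, with std::optional standing in for nullable rows and std::vector for device columns.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical host-side stand-ins (not libcudf types).
using Element = std::optional<int>;                   // nullable list element
using ListRow = std::optional<std::vector<Element>>;  // nullable list row
using Index   = std::optional<int>;                   // nullable index

// Reference model of the documented per-row rule (an assumption-based sketch).
std::vector<Element> extract_model(std::vector<ListRow> const& lists,
                                   std::vector<Index> const& indices)
{
  // Mirrors the documented size-mismatch error.
  if (lists.size() != indices.size()) {
    throw std::logic_error("lists and indices must have the same number of rows");
  }
  std::vector<Element> out(lists.size());  // every output row starts as null
  for (std::size_t i = 0; i < lists.size(); ++i) {
    if (!lists[i] || !indices[i]) { continue; }  // null list row or null index
    int const size = static_cast<int>(lists[i]->size());
    // Negative indices count back from the end of the sublist.
    int const pos = *indices[i] < 0 ? *indices[i] + size : *indices[i];
    if (pos < 0 || pos >= size) { continue; }    // out of bounds on either side
    out[i] = (*lists[i])[static_cast<std::size_t>(pos)];  // may itself be null
  }
  return out;
}
```

With the first example's input, l = { {1, 2, 3}, {4}, {5, 6} } and indices {0, null, 2}, this model yields {1, null, null}: the second row is null because its index is null, and the third because index 2 equals the sublist size.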

Parameters
lists_column  Column to extract elements from.
indices       The column whose rows indicate the element index to be retrieved from each list row.
stream        CUDA stream used for device memory operations and kernel launches.
mr            Device memory resource used to allocate the returned column's device memory.
Returns
Column of extracted elements.
Exceptions
cudf::logic_error  If the sizes of lists_column and indices do not match.

◆ extract_list_element() [2/2]

std::unique_ptr<column> cudf::lists::extract_list_element ( lists_column_view const &  lists_column,
size_type  index,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)

Create a column where each row is the element at position index from the corresponding sublist in the input lists_column.

Output column[i] is set from element lists_column[i][index]. If index is out of bounds for the sublist at lists_column[i] (that is, greater than or equal to its size), then output column[i] = null.

l = { {1, 2, 3}, {4}, {5, 6} }
r = extract_list_element(l, 1)
r is now {2, null, 6}

The index may also be negative, in which case the element is counted back from the end of each sublist, so -1 selects the last element of every row.

l = { {"a"}, {"b", "c"}, {"d", "e", "f"} }
r = extract_list_element(l, -1)
r is now {"a", "c", "f"}

Any input where lists_column[i] == null will produce output column[i] = null. Also, any element where lists_column[i][index] == null will produce output column[i] = null.
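
As a rough illustration of the single-index behaviour, the following host-side sketch applies one index to every row. It is a model of the documented semantics under stated assumptions, not libcudf code: NullableList and extract_at are hypothetical names, and std::optional and std::vector stand in for nullable rows and device columns. Treating a negative index that reaches past the start of a sublist as null is an assumption carried over from the indices-column overload.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a nullable list row whose elements are nullable.
template <typename T>
using NullableList = std::optional<std::vector<std::optional<T>>>;

// Sketch: apply the same index to every row, one nullable result per row.
template <typename T>
std::vector<std::optional<T>> extract_at(std::vector<NullableList<T>> const& lists,
                                         int index)
{
  std::vector<std::optional<T>> out(lists.size());  // rows default to null
  for (std::size_t i = 0; i < lists.size(); ++i) {
    if (!lists[i]) { continue; }  // null list row stays null
    int const size = static_cast<int>(lists[i]->size());
    // A negative index is resolved relative to the end of this sublist.
    int const pos = index < 0 ? index + size : index;
    if (pos >= 0 && pos < size) {
      out[i] = (*lists[i])[static_cast<std::size_t>(pos)];  // null elements stay null
    }
  }
  return out;
}
```

Because the index is resolved against each sublist's own size, index -1 picks "a", "c" and "f" from the string example above even though the sublists have different lengths.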

Parameters
lists_column  Column to extract elements from.
index         The row within each sublist to retrieve.
stream        CUDA stream used for device memory operations and kernel launches.
mr            Device memory resource used to allocate the returned column's device memory.
Returns
Column of extracted elements.