Slicing

Files

file  slice.hpp
 

Functions

std::unique_ptr< column > cudf::strings::slice_strings (strings_column_view const &input, numeric_scalar< size_type > const &start=numeric_scalar< size_type >(0, false), numeric_scalar< size_type > const &stop=numeric_scalar< size_type >(0, false), numeric_scalar< size_type > const &step=numeric_scalar< size_type >(1), rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=rmm::mr::get_current_device_resource())
 Returns a new strings column that contains substrings of the strings in the provided column.
 
std::unique_ptr< column > cudf::strings::slice_strings (strings_column_view const &input, column_view const &starts, column_view const &stops, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=rmm::mr::get_current_device_resource())
 Returns a new strings column that contains substrings of the strings in the provided column using unique ranges for each string.
 

Function Documentation

◆ slice_strings() [1/2]

std::unique_ptr<column> cudf::strings::slice_strings ( strings_column_view const &  input,
column_view const &  starts,
column_view const &  stops,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)

Returns a new strings column that contains substrings of the strings in the provided column using unique ranges for each string.

The character positions to retrieve in each string are specified in the starts and stops integer columns. If a start position is outside a string's length, an empty string is returned for that entry. If a stop position is past the end of a string, the end of the string is used as the stop position for that string. A stop value of -1 also means the end of the string is used as the stop position for that string.

Null string entries will return null output string entries.

The starts and stops columns must have the same integer type and must be the same size as the strings column.

Example:
s = ["hello", "goodbye"]
starts = [ 1, 2 ]
stops = [ 5, 4 ]
r = slice_strings(s,starts,stops)
r is now ["ello","od"]
Exceptions
cudf::logic_error  if starts or stops is a different size than the strings column.
cudf::logic_error  if starts and stops are not the same integer type.
cudf::logic_error  if starts or stops contains nulls.
Parameters
input  Strings column for this operation
starts  Character positions at which each substring begins
stops  Character positions (exclusive) at which each substring ends
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned column's device memory
Returns
New strings column containing the requested substrings
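
The per-string rules for this overload can be illustrated with a small host-only reference model. This is a minimal sketch under stated assumptions, not libcudf's implementation: it runs on the CPU, uses std::optional<std::string> to stand in for nullable rows, and indexes bytes, which coincides with character positions only for ASCII input. The name slice_rows_reference is hypothetical. Negative starts and stop values below -1 are not covered by the description above, so the sketch simply rejects them.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for a nullable string row (assumption: std::nullopt models a null entry).
using maybe_string = std::optional<std::string>;

// Hypothetical CPU model of the per-row slicing rules described in the text.
std::vector<maybe_string> slice_rows_reference(std::vector<maybe_string> const& input,
                                               std::vector<int> const& starts,
                                               std::vector<int> const& stops)
{
  // Mirrors the documented size requirement on starts and stops.
  if (starts.size() != input.size() || stops.size() != input.size()) {
    throw std::logic_error("starts and stops must be the same size as the input");
  }

  std::vector<maybe_string> out;
  out.reserve(input.size());

  for (std::size_t i = 0; i < input.size(); ++i) {
    // Null in, null out.
    if (!input[i].has_value()) {
      out.push_back(std::nullopt);
      continue;
    }

    std::string const& s = *input[i];
    int const len        = static_cast<int>(s.size());
    int const start      = starts[i];

    // Outside the documented behavior; rejected by this sketch only.
    if (start < 0 || stops[i] < -1) {
      throw std::invalid_argument("position not handled by this sketch");
    }

    // A stop of -1, or a stop past the end, means "end of string".
    int const stop = (stops[i] == -1 || stops[i] > len) ? len : stops[i];

    // A start outside the string (or an empty range) yields an empty string.
    if (start >= len || stop <= start) {
      out.push_back(std::string{});
      continue;
    }

    out.push_back(s.substr(start, stop - start));
  }

  assert(out.size() == input.size());
  return out;
}
```

With s = ["hello", "goodbye"], starts = [1, 2] and stops = [5, 4], this model yields ["ello", "od"], the same result as the example above.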

◆ slice_strings() [2/2]

std::unique_ptr<column> cudf::strings::slice_strings ( strings_column_view const &  input,
numeric_scalar< size_type > const &  start = numeric_scalar< size_type >(0, false),
numeric_scalar< size_type > const &  stop = numeric_scalar< size_type >(0, false),
numeric_scalar< size_type > const &  step = numeric_scalar< size_type >(1),
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = rmm::mr::get_current_device_resource() 
)

Returns a new strings column that contains substrings of the strings in the provided column.

The character positions to retrieve in each string are [start,stop). If the start position is outside a string's length, an empty string is returned for that entry. If the stop position is past the end of a string, the end of the string is used as the stop position for that string.

Null string entries will return null output string entries.

Example:
s = ["hello", "goodbye"]
r = slice_strings(s,2,6)
r is now ["llo","odby"]
r2 = slice_strings(s,2,5,2)
r2 is now ["lo","ob"]
Parameters
input  Strings column for this operation
start  First character position to begin the substring
stop  Last character position (exclusive) to end the substring
step  Distance between input characters retrieved
stream  CUDA stream used for device memory operations and kernel launches
mr  Device memory resource used to allocate the returned column's device memory
Returns
New strings column containing the requested substrings
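
The [start,stop) range and the step parameter can be sketched with a host-only reference model. This is an illustration under assumptions, not libcudf's implementation: it runs on the CPU, indexes bytes (so it agrees with character positions only for ASCII input), and handles only a non-negative start and a positive step. The defaults in the signature are invalid (null) scalars; the sketch assumes these mean "beginning of the string" and "end of the string" and models them with std::optional<int>. The name slice_step_reference is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Stand-in for a nullable string row (assumption: std::nullopt models a null entry).
using maybe_string = std::optional<std::string>;

// Hypothetical CPU model of slicing every row with the same [start, stop) range and step.
std::vector<maybe_string> slice_step_reference(std::vector<maybe_string> const& input,
                                               std::optional<int> start = std::nullopt,
                                               std::optional<int> stop  = std::nullopt,
                                               int step                 = 1)
{
  // Limits of this sketch, not of the real API.
  assert(step > 0);
  int const first = start.value_or(0);  // assumed meaning of an unset start
  assert(first >= 0);

  std::vector<maybe_string> out;
  out.reserve(input.size());

  for (auto const& row : input) {
    // Null in, null out.
    if (!row.has_value()) {
      out.push_back(std::nullopt);
      continue;
    }

    int const len = static_cast<int>(row->size());
    // An unset stop, or a stop past the end, means "end of string".
    int const last = std::min(stop.value_or(len), len);

    // Take every step-th character in [first, last); empty if first >= last.
    std::string piece;
    for (int pos = first; pos < last; pos += step) {
      piece.push_back((*row)[pos]);
    }
    out.push_back(piece);
  }
  return out;
}
```

With s = ["hello", "goodbye"], calling this model with (2, 6) yields ["llo", "odby"], and calling it with (2, 5, 2) yields ["lo", "ob"], matching the example above.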