

file  find.hpp
file  find_multiple.hpp


std::unique_ptr< column > cudf::strings::find (strings_column_view const &input, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of character position values where the target string is first found in each string of the provided column.
std::unique_ptr< column > cudf::strings::rfind (strings_column_view const &input, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of character position values where the target string is first found searching from the end of each string.
std::unique_ptr< column > cudf::strings::find (strings_column_view const &input, strings_column_view const &target, size_type start=0, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of character position values where the target string is first found in the corresponding string of the provided column.
std::unique_ptr< column > cudf::strings::contains (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.
std::unique_ptr< column > cudf::strings::contains (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.
std::unique_ptr< column > cudf::strings::starts_with (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.
std::unique_ptr< column > cudf::strings::starts_with (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.
std::unique_ptr< column > cudf::strings::ends_with (strings_column_view const &input, string_scalar const &target, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.
std::unique_ptr< column > cudf::strings::ends_with (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.
std::unique_ptr< table > cudf::strings::contains_multiple (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Searches for the given target strings within each string in the provided column.
std::unique_ptr< column > cudf::strings::find_multiple (strings_column_view const &input, strings_column_view const &targets, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref())
 Searches for the given target strings within each string in the provided column and returns the positions at which the targets were found.

Detailed Description

Function Documentation

◆ contains() [1/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.

If the target is not found for a string, false is returned for that entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output columns.

Parameters
  input   Strings instance for this operation
  target  UTF-8 encoded string to search for in each string
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New BOOL8 column

◆ contains() [2/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.

output[i] = true if string targets[i] is found inside input[i]; otherwise output[i] = false. If targets[i] is an empty string, true is returned for output[i]. If targets[i] is null, false is returned for output[i].

Any null string entries return corresponding null entries in the output columns.

Exceptions
  cudf::logic_error  if input.size() != targets.size().
Parameters
  input    Strings instance for this operation
  targets  Strings column of targets to check row-wise in input
  stream   CUDA stream used for device memory operations and kernel launches
  mr       Device memory resource used to allocate the returned column's device memory
Returns
  New BOOL8 column
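
The row-wise rules above can be illustrated with a small host-side model. This is only a sketch in standard C++, not cudf code: `Row`, `contains_rowwise_model`, and `check_contains_rowwise` are hypothetical names, `std::nullopt` stands in for a null row, and `std::logic_error` stands in for `cudf::logic_error`. The sketch also assumes that a null input row takes precedence when both the input and the target are null.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;

// Hypothetical helper (not part of cudf) mirroring the row-wise rules above.
std::vector<std::optional<bool>> contains_rowwise_model(std::vector<Row> const& input,
                                                        std::vector<Row> const& targets)
{
  // cudf documents cudf::logic_error here; std::logic_error is a stand-in.
  if (input.size() != targets.size()) { throw std::logic_error("size mismatch"); }

  std::vector<std::optional<bool>> out;
  out.reserve(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    if (!input[i]) {
      out.push_back(std::nullopt);  // null input row -> null output row
    } else if (!targets[i]) {
      out.push_back(false);  // null target -> false
    } else {
      // std::string::find("") returns 0, so an empty target yields true
      out.push_back(input[i]->find(*targets[i]) != std::string::npos);
    }
  }
  return out;
}

// Small self-check covering a match, an empty target, a null input, and a null target.
void check_contains_rowwise()
{
  auto const out =
    contains_rowwise_model({Row{"hello"}, Row{"world"}, std::nullopt, Row{"abc"}},
                           {Row{"ell"}, Row{""}, Row{"x"}, std::nullopt});
  assert(out[0] == true);
  assert(out[1] == true);
  assert(!out[2].has_value());
  assert(out[3] == false);
}
```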

◆ contains_multiple()

std::unique_ptr<table> cudf::strings::contains_multiple ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() )

Searches for the given target strings within each string in the provided column.

Each column in the result table corresponds to the result for the target string at the same ordinal. i.e. 0th column is the BOOL8 column result for the 0th target string, 1st for 1st, etc.

If the target is not found for a string, false is returned for that entry in the output column. If the target is an empty string, true is returned for all non-null entries in the output column.

Any null input strings return corresponding null entries in the output columns.

input = ["a", "b", "c"]
targets = ["a", "c"]
output is a table with two boolean columns:
column 0: [true, false, false]
column 1: [false, false, true]
Exceptions
  std::invalid_argument  if targets is empty or contains nulls
Parameters
  input    Strings instance for this operation
  targets  UTF-8 encoded strings to search for in each string in input
  stream   CUDA stream used for device memory operations and kernel launches
  mr       Device memory resource used to allocate the returned column's device memory
Returns
  Table of BOOL8 columns
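
As a rough illustration of the table layout described above (one boolean column per target), here is a host-side model in standard C++. It is a sketch, not the cudf implementation: `Row`, `BoolColumn`, `contains_multiple_model`, and `check_contains_multiple` are hypothetical names, and targets are plain `std::string` values, so the "contains nulls" error case cannot arise in this model.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null input row.
using Row        = std::optional<std::string>;
using BoolColumn = std::vector<std::optional<bool>>;

// Hypothetical helper (not part of cudf): result[j] is the column for targets[j].
std::vector<BoolColumn> contains_multiple_model(std::vector<Row> const& input,
                                                std::vector<std::string> const& targets)
{
  if (targets.empty()) { throw std::invalid_argument("targets must not be empty"); }

  std::vector<BoolColumn> table(targets.size());
  for (std::size_t j = 0; j < targets.size(); ++j) {
    table[j].reserve(input.size());
    for (auto const& row : input) {
      if (!row) {
        table[j].push_back(std::nullopt);  // null input -> null in every column
      } else {
        table[j].push_back(row->find(targets[j]) != std::string::npos);
      }
    }
  }
  return table;
}

// Reproduces the example from the text: input ["a","b","c"], targets ["a","c"].
void check_contains_multiple()
{
  auto const table = contains_multiple_model({Row{"a"}, Row{"b"}, Row{"c"}}, {"a", "c"});
  BoolColumn const expected0{true, false, false};
  BoolColumn const expected1{false, false, true};
  assert(table.size() == 2);
  assert(table[0] == expected0);
  assert(table[1] == expected1);
}
```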

◆ ends_with() [1/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.

If target is not found at the end of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output columns.

Parameters
  input   Strings instance for this operation
  target  UTF-8 encoded string to search for in each string
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New BOOL8 column
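
The suffix rules above (false when not found at the end, true for an empty target, null for null rows) can be sketched on the host in standard C++. The names `Row`, `ends_with_model`, and `check_ends_with` are hypothetical and this is not how cudf implements the operation. The comparison is byte-wise, which is assumed here to be sufficient for a suffix test.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;

// Hypothetical helper (not part of cudf) mirroring the scalar-target suffix rules.
std::vector<std::optional<bool>> ends_with_model(std::vector<Row> const& input,
                                                 std::string const& target)
{
  std::vector<std::optional<bool>> out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (!row) {
      out.push_back(std::nullopt);  // null in, null out
      continue;
    }
    // A row shorter than the target can never match; an empty target always matches.
    bool const match = row->size() >= target.size() &&
                       row->compare(row->size() - target.size(), target.size(), target) == 0;
    out.push_back(match);
  }
  return out;
}

// Small self-check of the rules.
void check_ends_with()
{
  auto const out = ends_with_model({Row{"filename.txt"}, Row{"txt"}, Row{"tx"}, std::nullopt}, "txt");
  assert(out[0] == true);
  assert(out[1] == true);
  assert(out[2] == false);
  assert(!out[3].has_value());
}
```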

◆ ends_with() [2/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.

If targets[i] is not found at the end of input[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output columns.

Exceptions
  cudf::logic_error  if input.size() != targets.size().
Parameters
  input    Strings instance for this operation
  targets  Strings instance for this operation
  stream   CUDA stream used for device memory operations and kernel launches
  mr       Device memory resource used to allocate the returned column's device memory
Returns
  New BOOL8 column

◆ find() [1/2]

std::unique_ptr<column> cudf::strings::find ( strings_column_view const &  input,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of character position values where the target string is first found in each string of the provided column.

If target is not found, -1 is returned for that row entry in the output column.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
  cudf::logic_error  if start position is greater than stop position.
Parameters
  input   Strings instance for this operation
  target  UTF-8 encoded string to search for in each string
  start   First character position to include in the search
  stop    Last position (exclusive) to include in the search. Default of -1 will search to the end of the string.
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New integer column with character position values
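
To make the [start,stop) range concrete, the following host-side sketch models the search in standard C++. It is not cudf code and rests on several assumptions: `Row`, `Pos`, `find_model`, and `check_find` are hypothetical names; rows are ASCII, so byte offsets equal character positions; a match is assumed to have to lie entirely inside the range; `std::logic_error` stands in for `cudf::logic_error`; and rejecting a negative start is a choice of this sketch, not documented behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;
using Pos = std::optional<std::int32_t>;

// Hypothetical helper (not part of cudf): forward search within [start, stop).
std::vector<Pos> find_model(std::vector<Row> const& input,
                            std::string const& target,
                            std::int32_t start = 0,
                            std::int32_t stop  = -1)
{
  // Stand-in for the documented cudf::logic_error when start > stop.
  if (start < 0 || (stop >= 0 && start > stop)) { throw std::logic_error("invalid range"); }

  std::vector<Pos> out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (!row) {
      out.push_back(std::nullopt);  // null in, null out
      continue;
    }
    auto const len = static_cast<std::int32_t>(row->size());
    auto const end = (stop < 0 || stop > len) ? len : stop;  // -1 means end of string
    std::int32_t result = -1;                                // -1 means not found
    if (start <= end) {
      auto const window = row->substr(start, end - start);
      auto const p      = window.find(target);
      if (p != std::string::npos) { result = start + static_cast<std::int32_t>(p); }
    }
    out.push_back(result);
  }
  return out;
}

// Small self-check: default range, then a start offset, then a stop bound.
void check_find()
{
  auto const a = find_model({Row{"hello world"}, std::nullopt, Row{"lo"}}, "lo");
  assert(a[0] == 3);
  assert(!a[1].has_value());
  assert(a[2] == 0);
  assert(find_model({Row{"hello world"}}, "lo", 4)[0] == -1);
  assert(find_model({Row{"hello world"}}, "lo", 0, 4)[0] == -1);
  assert(find_model({Row{"hello world"}}, "lo", 0, 5)[0] == 3);
}
```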

◆ find() [2/2]

std::unique_ptr<column> cudf::strings::find ( strings_column_view const &  input,
strings_column_view const &  target,
size_type  start = 0,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of character position values where the target string is first found in the corresponding string of the provided column.

The output of row i is the character position of the target string for row i within input string of row i starting at the character position start. If the target is not found within the input string, -1 is returned for that row entry in the output column.

Any null input or target entries return corresponding null output column entries.

Exceptions
  cudf::logic_error  if input.size() != target.size()
Parameters
  input   Strings to search against
  target  Strings to search for in input
  start   First character position to include in the search
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New integer column with character position values

◆ find_multiple()

std::unique_ptr<column> cudf::strings::find_multiple ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Searches for the given target strings within each string in the provided column and returns the positions at which the targets were found.

The size of the output column is input.size(). Each row of the output column is of size targets.size().

output[i,j] contains the position of targets[j] in input[i]

s = ["abc", "def"]
t = ["a", "c", "e"]
r = find_multiple(s, t)
r is now {[ 0, 2,-1],   // for "abc": "a" at pos 0, "c" at pos 2, "e" not found
          [-1,-1, 1]}   // for "def": "a" and "c" not found, "e" at pos 1
Exceptions
  std::invalid_argument  if targets is empty or contains nulls
Parameters
  input    Strings instance for this operation
  targets  Strings to search for in each string
  stream   CUDA stream used for device memory operations and kernel launches
  mr       Device memory resource used to allocate the returned column's device memory
Returns
  Lists column with character position values
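
The output[i,j] layout can be sketched on the host with nested vectors in standard C++. This is a model under assumptions, not cudf code: `find_multiple_model` and `check_find_multiple` are hypothetical names, rows are ASCII so byte offsets equal character positions, and both input and targets are plain `std::string` values, so null rows are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of cudf): out[i][j] is the position of
// targets[j] in input[i], or -1 when it is not found.
std::vector<std::vector<std::int32_t>> find_multiple_model(
  std::vector<std::string> const& input, std::vector<std::string> const& targets)
{
  if (targets.empty()) { throw std::invalid_argument("targets must not be empty"); }

  std::vector<std::vector<std::int32_t>> out;
  out.reserve(input.size());
  for (auto const& s : input) {
    std::vector<std::int32_t> row;
    row.reserve(targets.size());
    for (auto const& t : targets) {
      auto const p = s.find(t);
      row.push_back(p == std::string::npos ? -1 : static_cast<std::int32_t>(p));
    }
    out.push_back(std::move(row));
  }
  return out;
}

// Reproduces the example from the text: s = ["abc","def"], t = ["a","c","e"].
void check_find_multiple()
{
  auto const r = find_multiple_model({"abc", "def"}, {"a", "c", "e"});
  std::vector<std::vector<std::int32_t>> const expected{{0, 2, -1}, {-1, -1, 1}};
  assert(r == expected);
}
```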

◆ rfind()

std::unique_ptr<column> cudf::strings::rfind ( strings_column_view const &  input,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of character position values where the target string is first found searching from the end of each string.

If target is not found, -1 is returned for that entry.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
  cudf::logic_error  if start position is greater than stop position.
Parameters
  input   Strings instance for this operation
  target  UTF-8 encoded string to search for in each string
  start   First position to include in the search
  stop    Last position (exclusive) to include in the search. Default of -1 will search starting at the end of the string.
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New integer column with character position values
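
The reverse search can be modeled on the host in the same way as the forward one, returning the last occurrence inside [start,stop). The sketch below is standard C++, not cudf code. Its assumptions: `Row`, `Pos`, `rfind_model`, and `check_rfind` are hypothetical names; rows are ASCII; a match must lie entirely inside the range; `std::logic_error` stands in for `cudf::logic_error`; and a negative start is rejected by the sketch only.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;
using Pos = std::optional<std::int32_t>;

// Hypothetical helper (not part of cudf): last occurrence within [start, stop).
std::vector<Pos> rfind_model(std::vector<Row> const& input,
                             std::string const& target,
                             std::int32_t start = 0,
                             std::int32_t stop  = -1)
{
  // Stand-in for the documented cudf::logic_error when start > stop.
  if (start < 0 || (stop >= 0 && start > stop)) { throw std::logic_error("invalid range"); }

  std::vector<Pos> out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (!row) {
      out.push_back(std::nullopt);  // null in, null out
      continue;
    }
    auto const len = static_cast<std::int32_t>(row->size());
    auto const end = (stop < 0 || stop > len) ? len : stop;  // -1 means end of string
    std::int32_t result = -1;                                // -1 means not found
    if (start <= end) {
      auto const window = row->substr(start, end - start);
      auto const p      = window.rfind(target);  // searches backwards from the window end
      if (p != std::string::npos) { result = start + static_cast<std::int32_t>(p); }
    }
    out.push_back(result);
  }
  return out;
}

// Small self-check: the last match wins, and stop can exclude it.
void check_rfind()
{
  auto const a = rfind_model({Row{"abcabc"}, std::nullopt, Row{"xyz"}}, "bc");
  assert(a[0] == 4);
  assert(!a[1].has_value());
  assert(a[2] == -1);
  assert(rfind_model({Row{"abcabc"}}, "bc", 0, 5)[0] == 1);
}
```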

◆ starts_with() [1/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  input,
string_scalar const &  target,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.

If target is not found at the beginning of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output columns.

Parameters
  input   Strings instance for this operation
  target  UTF-8 encoded string to search for in each string
  stream  CUDA stream used for device memory operations and kernel launches
  mr      Device memory resource used to allocate the returned column's device memory
Returns
  New type_id::BOOL8 column.
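
The prefix rules above can be sketched in the same host-side style. This is standard C++ for illustration only, not the cudf implementation; `Row`, `starts_with_model`, and `check_starts_with` are hypothetical names, and the comparison is byte-wise.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Host-side model only: std::nullopt stands in for a null row.
using Row = std::optional<std::string>;

// Hypothetical helper (not part of cudf) mirroring the scalar-target prefix rules.
std::vector<std::optional<bool>> starts_with_model(std::vector<Row> const& input,
                                                   std::string const& target)
{
  std::vector<std::optional<bool>> out;
  out.reserve(input.size());
  for (auto const& row : input) {
    if (!row) {
      out.push_back(std::nullopt);  // null in, null out
      continue;
    }
    // compare() clamps the length to the row size, so a row shorter than the
    // target compares unequal; an empty target compares equal to any prefix.
    out.push_back(row->compare(0, target.size(), target) == 0);
  }
  return out;
}

// Small self-check of the rules.
void check_starts_with()
{
  auto const out =
    starts_with_model({Row{"abc"}, Row{"ab"}, Row{"a"}, Row{""}, std::nullopt}, "ab");
  assert(out[0] == true);
  assert(out[1] == true);
  assert(out[2] == false);
  assert(out[3] == false);
  assert(!out[4].has_value());
}
```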

◆ starts_with() [2/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::cuda_stream_view  stream = cudf::get_default_stream(),
rmm::device_async_resource_ref  mr = cudf::get_current_device_resource_ref() )

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.

If targets[i] is not found at the beginning of input[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output columns.

Exceptions
  cudf::logic_error  if input.size() != targets.size().
Parameters
  input    Strings instance for this operation
  targets  Strings instance for this operation
  stream   CUDA stream used for device memory operations and kernel launches
  mr       Device memory resource used to allocate the returned column's device memory
Returns
  New BOOL8 column