Finding

Files

file  find.hpp
 
file  find_multiple.hpp
 

Functions

std::unique_ptr< column > cudf::strings::find (strings_column_view const &strings, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found in each string of the provided column.
 
std::unique_ptr< column > cudf::strings::rfind (strings_column_view const &strings, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found searching from the end of each string.
 
std::unique_ptr< column > cudf::strings::contains (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.
 
std::unique_ptr< column > cudf::strings::contains (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.
 
std::unique_ptr< column > cudf::strings::starts_with (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.
 
std::unique_ptr< column > cudf::strings::starts_with (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.
 
std::unique_ptr< column > cudf::strings::ends_with (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.
 
std::unique_ptr< column > cudf::strings::ends_with (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.
 
std::unique_ptr< column > cudf::strings::find_multiple (strings_column_view const &input, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a lists column with character position values where each of the target strings is found in each string.
 

Function Documentation

◆ contains() [1/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  strings,
string_scalar const &  target,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.

If the target is not found for a string, false is returned for that entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
strings  Strings instance for this operation.
target  UTF-8 encoded string to search for in each string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.
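
The null and empty-target rules above can be illustrated with a small host-side model. This is only a sketch of the documented semantics, not the libcudf implementation: it runs on the CPU, and `host_strings` and `contains_model` are hypothetical names in which `std::nullopt` stands in for a null row.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column:
// std::nullopt marks a null row.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of contains() with a scalar target.
std::vector<std::optional<bool>> contains_model(host_strings const& strings,
                                                std::string const& target)
{
  std::vector<std::optional<bool>> output;
  output.reserve(strings.size());
  for (auto const& row : strings) {
    if (!row.has_value()) {
      output.push_back(std::nullopt);  // null string in, null entry out
    } else {
      // std::string::find reports position 0 for an empty target,
      // so an empty target yields true for every non-null row.
      output.push_back(row->find(target) != std::string::npos);
    }
  }
  return output;
}
```

With rows `"hello"`, null, and `""`, a target of `"ell"` yields true, null, false, while an empty target yields true, null, true.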

◆ contains() [2/2]

std::unique_ptr<column> cudf::strings::contains ( strings_column_view const &  strings,
strings_column_view const &  targets,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.

output[i] = true if string targets[i] is found inside strings[i]; otherwise output[i] = false. If targets[i] is an empty string, true is returned for output[i]. If targets[i] is null, false is returned for output[i].

Any null strings[i] row results in a null output[i] row.

Exceptions
cudf::logic_error  if strings.size() != targets.size().
Parameters
strings  Strings instance for this operation.
targets  Strings column of targets to check row-wise in strings.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.
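
The row-wise rules can be modeled on the host as follows. This is a sketch of the documented behavior only, not libcudf code: `host_strings` and `contains_rowwise_model` are hypothetical names, `std::logic_error` stands in for `cudf::logic_error`, and checking the strings row for null before the targets row is an assumption about precedence when both are null.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of the row-wise contains() overload.
std::vector<std::optional<bool>> contains_rowwise_model(host_strings const& strings,
                                                        host_strings const& targets)
{
  // Stand-in for the documented cudf::logic_error on mismatched sizes.
  if (strings.size() != targets.size()) {
    throw std::logic_error("strings and targets must have the same size");
  }
  std::vector<std::optional<bool>> output(strings.size());
  for (std::size_t i = 0; i < strings.size(); ++i) {
    if (!strings[i].has_value()) {
      output[i] = std::nullopt;  // null strings[i] gives a null output[i]
    } else if (!targets[i].has_value()) {
      output[i] = false;         // null targets[i] gives false
    } else {
      // An empty targets[i] is found at position 0, so it gives true.
      output[i] = strings[i]->find(*targets[i]) != std::string::npos;
    }
  }
  return output;
}
```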

◆ ends_with() [1/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  strings,
string_scalar const &  target,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.

If target is not found at the end of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
strings  Strings instance for this operation.
target  UTF-8 encoded string to search for in each string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.
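
A minimal host-side sketch of the suffix check described above, written against plain `std::string` rather than libcudf. `host_strings` and `ends_with_model` are hypothetical names, and `std::nullopt` represents a null row.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of ends_with() with a scalar target.
std::vector<std::optional<bool>> ends_with_model(host_strings const& strings,
                                                 std::string const& target)
{
  std::vector<std::optional<bool>> output;
  output.reserve(strings.size());
  for (auto const& row : strings) {
    if (!row.has_value()) {
      output.push_back(std::nullopt);  // null string in, null entry out
      continue;
    }
    // The row must be at least as long as the target, and its last
    // target.size() bytes must equal the target. An empty target
    // compares equal to the empty tail of any row, giving true.
    bool const matches =
      row->size() >= target.size() &&
      row->compare(row->size() - target.size(), target.size(), target) == 0;
    output.push_back(matches);
  }
  return output;
}
```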

◆ ends_with() [2/2]

std::unique_ptr<column> cudf::strings::ends_with ( strings_column_view const &  strings,
strings_column_view const &  targets,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.

If targets[i] is not found at the end of strings[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output column.

Exceptions
cudf::logic_error  if strings.size() != targets.size().
Parameters
strings  Strings instance for this operation.
targets  Strings column of targets to check row-wise in strings.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.

◆ find()

std::unique_ptr<column> cudf::strings::find ( strings_column_view const &  strings,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of character position values where the target string is first found in each string of the provided column.

If target is not found, -1 is returned for that row entry in the output column.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
cudf::logic_error  if start position is greater than stop position.
Parameters
strings  Strings instance for this operation.
target  UTF-8 encoded string to search for in each string.
start  First character position to include in the search.
stop  Last position (exclusive) to include in the search. Default of -1 will search to the end of the string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New integer column with character position values.
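
The [start, stop) range handling can be sketched on the host as below. This is an illustrative model, not libcudf code, and it rests on several assumptions: every character is a single byte (so byte offsets equal character positions), a match must lie entirely inside the range, positions are reported relative to the start of the string, and a negative start is rejected. `host_strings` and `find_model` are hypothetical names, and `std::logic_error` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of find() for single-byte characters.
std::vector<std::optional<int>> find_model(host_strings const& strings,
                                           std::string const& target,
                                           int start = 0,
                                           int stop  = -1)
{
  // stop == -1 means "to the end of each string", so it is exempt from
  // the start <= stop check. Rejecting start < 0 is this sketch's assumption.
  if (start < 0 || (stop >= 0 && start > stop)) {
    throw std::logic_error("start position is greater than stop position");
  }
  std::vector<std::optional<int>> output;
  output.reserve(strings.size());
  for (auto const& row : strings) {
    if (!row.has_value()) {
      output.push_back(std::nullopt);  // null string in, null entry out
      continue;
    }
    int const length = static_cast<int>(row->size());
    int const end    = (stop < 0 || stop > length) ? length : stop;
    int position     = -1;  // -1 marks "not found"
    if (start <= end) {
      // Search only the window [start, end) and shift the hit back
      // into whole-string coordinates.
      std::string const window = row->substr(static_cast<std::size_t>(start),
                                             static_cast<std::size_t>(end - start));
      auto const found = window.find(target);
      if (found != std::string::npos) { position = start + static_cast<int>(found); }
    }
    output.push_back(position);
  }
  return output;
}
```

For the row `"abcabc"` and target `"bc"`, the default range gives 1, `start = 2` gives 4, and the range [0, 2) gives -1 because the match does not fit inside the window.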

◆ find_multiple()

std::unique_ptr<column> cudf::strings::find_multiple ( strings_column_view const &  input,
strings_column_view const &  targets,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a lists column with character position values where each of the target strings is found in each string.

The size of the output column is input.size(). Each row of the output column is of size targets.size().

output[i,j] contains the position of targets[j] in input[i], or -1 if targets[j] is not found.

Example:
s = ["abc", "def"]
t = ["a", "c", "e"]
r = find_multiple(s, t)
r is now {[ 0,  2, -1],   // for "abc": "a" at pos 0, "c" at pos 2, "e" not found
          [-1, -1,  1]}   // for "def": "a" and "c" not found, "e" at pos 1
Exceptions
cudf::logic_error  if targets is empty or contains nulls.
Parameters
input  Strings instance for this operation.
targets  Strings to search for in each string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
Lists column with character position values.
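
The example above can be reproduced with a small host-side model. This is a sketch only, not the libcudf implementation: it assumes single-byte characters and non-null input rows (null input handling is not covered here), `host_strings` and `find_multiple_model` are hypothetical names, and `std::logic_error` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of find_multiple(): one inner vector per
// input row, one position per target.
std::vector<std::vector<int>> find_multiple_model(std::vector<std::string> const& input,
                                                  host_strings const& targets)
{
  // Stand-ins for the documented cudf::logic_error conditions.
  if (targets.empty()) { throw std::logic_error("targets is empty"); }
  for (auto const& target : targets) {
    if (!target.has_value()) { throw std::logic_error("targets contains nulls"); }
  }
  std::vector<std::vector<int>> output;
  output.reserve(input.size());
  for (auto const& row : input) {
    std::vector<int> positions;
    positions.reserve(targets.size());
    for (auto const& target : targets) {
      auto const found = row.find(*target);
      // -1 marks a target that does not occur in this row.
      positions.push_back(found == std::string::npos ? -1 : static_cast<int>(found));
    }
    output.push_back(std::move(positions));
  }
  return output;
}
```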

◆ rfind()

std::unique_ptr<column> cudf::strings::rfind ( strings_column_view const &  strings,
string_scalar const &  target,
size_type  start = 0,
size_type  stop = -1,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of character position values where the target string is first found searching from the end of each string.

If target is not found, -1 is returned for that entry.

The target string is searched within each string in the character position range [start,stop). If the stop parameter is -1, then the end of each string becomes the final position to include in the search.

Any null string entries return corresponding null output column entries.

Exceptions
cudf::logic_error  if start position is greater than stop position.
Parameters
strings  Strings instance for this operation.
target  UTF-8 encoded string to search for in each string.
start  First position to include in the search.
stop  Last position (exclusive) to include in the search. Default of -1 will search starting at the end of the string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New integer column with character position values.
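
A host-side sketch of the reverse search, under the same assumptions one would make for a forward-search model: single-byte characters, a match that must lie entirely inside [start, stop), positions reported relative to the start of the string, and a negative start rejected. `host_strings` and `rfind_model` are hypothetical names, not libcudf APIs, and `std::logic_error` stands in for `cudf::logic_error`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of rfind() for single-byte characters.
std::vector<std::optional<int>> rfind_model(host_strings const& strings,
                                            std::string const& target,
                                            int start = 0,
                                            int stop  = -1)
{
  // stop == -1 means "to the end of each string", so it is exempt from
  // the start <= stop check. Rejecting start < 0 is this sketch's assumption.
  if (start < 0 || (stop >= 0 && start > stop)) {
    throw std::logic_error("start position is greater than stop position");
  }
  std::vector<std::optional<int>> output;
  output.reserve(strings.size());
  for (auto const& row : strings) {
    if (!row.has_value()) {
      output.push_back(std::nullopt);  // null string in, null entry out
      continue;
    }
    int const length = static_cast<int>(row->size());
    int const end    = (stop < 0 || stop > length) ? length : stop;
    int position     = -1;  // -1 marks "not found"
    if (start <= end) {
      // Take the window [start, end) and look for the last occurrence in it.
      std::string const window = row->substr(static_cast<std::size_t>(start),
                                             static_cast<std::size_t>(end - start));
      auto const found = window.rfind(target);
      if (found != std::string::npos) { position = start + static_cast<int>(found); }
    }
    output.push_back(position);
  }
  return output;
}
```

For the row `"abcabc"` and target `"bc"`, the default range gives 4 (the last occurrence), while the range [0, 4) gives 1.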

◆ starts_with() [1/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  strings,
string_scalar const &  target,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.

If target is not found at the beginning of a string, false is set for that row entry in the output column. If target is an empty string, true is returned for all non-null entries in the output column.

Any null string entries return corresponding null entries in the output column.

Parameters
strings  Strings instance for this operation.
target  UTF-8 encoded string to search for in each string.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.
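
The prefix check can be modeled on the host in a few lines. As with any such model, this is a sketch of the documented semantics using plain `std::string`, not libcudf code; `host_strings` and `starts_with_model` are hypothetical names, and `std::nullopt` represents a null row.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical host-side stand-in for a nullable strings column.
using host_strings = std::vector<std::optional<std::string>>;

// Hypothetical reference model of starts_with() with a scalar target.
std::vector<std::optional<bool>> starts_with_model(host_strings const& strings,
                                                   std::string const& target)
{
  std::vector<std::optional<bool>> output;
  output.reserve(strings.size());
  for (auto const& row : strings) {
    if (!row.has_value()) {
      output.push_back(std::nullopt);  // null string in, null entry out
      continue;
    }
    // Compare the first target.size() bytes of the row with the target.
    // compare() clamps the length for short rows, so a row shorter than
    // the target never matches; an empty target always matches.
    output.push_back(row->compare(0, target.size(), target) == 0);
  }
  return output;
}
```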

◆ starts_with() [2/2]

std::unique_ptr<column> cudf::strings::starts_with ( strings_column_view const &  strings,
strings_column_view const &  targets,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource()
)

Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.

If targets[i] is not found at the beginning of strings[i], false is set for that row entry in the output column. If targets[i] is an empty string, true is returned for the corresponding entry in the output column.

Any null string entries in targets return corresponding null entries in the output column.

Exceptions
cudf::logic_error  if strings.size() != targets.size().
Parameters
strings  Strings instance for this operation.
targets  Strings column of targets to check row-wise in strings.
mr  Device memory resource used to allocate the returned column's device memory.
Returns
New type_id::BOOL8 column.