Files | |
file | char_types.hpp |
file | char_types_enum.hpp |
Enumerations | |
enum | cudf::strings::string_character_types : uint32_t { cudf::strings::DECIMAL = 1 << 0 , cudf::strings::NUMERIC = 1 << 1 , cudf::strings::DIGIT = 1 << 2 , cudf::strings::ALPHA = 1 << 3 , cudf::strings::SPACE = 1 << 4 , cudf::strings::UPPER = 1 << 5 , cudf::strings::LOWER = 1 << 6 , cudf::strings::ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA , cudf::strings::CASE_TYPES = UPPER | LOWER , cudf::strings::ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE } |
Character type values. These types can be or'd to check for any combination of types. | |
enum cudf::strings::string_character_types : uint32_t |
Character type values. These types can be or'd to check for any combination of types.
This cannot be turned into an enum class because or'd entries can result in values that are not in the class. For example, NUMERIC|SPACE is a valid, reasonable combination but does not match any explicitly named enumerator.
Definition at line 38 of file char_types_enum.hpp.
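To make the bit layout concrete, the sketch below re-declares the enumerators in a hypothetical `sketch` namespace (a local stand-in so it compiles without cudf, not the header itself) and checks the values of the composite enumerators. It also shows that NUMERIC|SPACE evaluates to 0x12, which no named enumerator has.

```cpp
#include <cstdint>

// Local mirror of the enumerators listed above (assumption: same values as
// char_types_enum.hpp). The namespace name is hypothetical.
namespace sketch {
enum string_character_types : std::uint32_t {
  DECIMAL    = 1 << 0,
  NUMERIC    = 1 << 1,
  DIGIT      = 1 << 2,
  ALPHA      = 1 << 3,
  SPACE      = 1 << 4,
  UPPER      = 1 << 5,
  LOWER      = 1 << 6,
  ALPHANUM   = DECIMAL | NUMERIC | DIGIT | ALPHA,
  CASE_TYPES = UPPER | LOWER,
  ALL_TYPES  = ALPHANUM | CASE_TYPES | SPACE
};
}  // namespace sketch

// Composite enumerators are plain unions of the single-bit flags.
static_assert(sketch::ALPHANUM == 0x0Fu, "bits 0-3");
static_assert(sketch::CASE_TYPES == 0x60u, "bits 5-6");
static_assert(sketch::ALL_TYPES == 0x7Fu, "bits 0-6");

// Without an overloaded operator, OR-ing two enumerators promotes them to an
// unsigned integer; the value 0x12 is not one of the named enumerators.
static_assert((sketch::NUMERIC | sketch::SPACE) == 0x12u, "NUMERIC|SPACE");
```

Because every single-bit flag occupies its own bit, any subset of flags is a distinct value, which is what makes or'd combinations usable as a filter mask.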
std::unique_ptr<column> cudf::strings::all_characters_of_type(
    strings_column_view const & input,
    string_character_types types,
    string_character_types verify_types = string_character_types::ALL_TYPES,
    rmm::cuda_stream_view stream = cudf::get_default_stream(),
    rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())
Returns a boolean column identifying string entries where all characters are of the type specified.
The output row entry will be set to false if the corresponding string element is empty or has at least one character not of the specified type. If all characters fit the type then true is set in that output row entry.
To ignore all but specific types, set verify_types to those types which should be checked. Otherwise, the default ALL_TYPES will verify that all characters match types.
Any null row results in a null entry for that row in the output column.
input | Strings instance for this operation |
types | The character types to check in each string |
verify_types | Only verify against these character types. Default ALL_TYPES means return true iff all characters match types. |
stream | CUDA stream used for device memory operations and kernel launches |
mr | Device memory resource used to allocate the returned column's device memory |
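The interaction between types and verify_types can be sketched on the host. The code below is an ASCII-only emulation of the documented per-row rule, not cudf's implementation: `sketch`, `flags_of`, and `all_of_type` are hypothetical names, the classifier is a simplification of cudf's character tables, masks are passed as plain `std::uint32_t` so or'd values need no operator overload, and null rows are not modelled.

```cpp
#include <cstdint>

namespace sketch {
// Local mirror of the enumerators (assumption: same values as cudf's).
enum string_character_types : std::uint32_t {
  DECIMAL = 1 << 0, NUMERIC = 1 << 1, DIGIT = 1 << 2, ALPHA = 1 << 3,
  SPACE = 1 << 4, UPPER = 1 << 5, LOWER = 1 << 6,
  ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA,
  CASE_TYPES = UPPER | LOWER,
  ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE
};

// Hypothetical ASCII-only classifier standing in for cudf's lookup table.
constexpr std::uint32_t flags_of(char c) {
  if (c >= '0' && c <= '9') return DECIMAL | NUMERIC | DIGIT;
  if (c >= 'a' && c <= 'z') return ALPHA | LOWER;
  if (c >= 'A' && c <= 'Z') return ALPHA | UPPER;
  if (c == ' ' || (c >= '\t' && c <= '\r')) return SPACE;
  return 0;  // punctuation and everything else: no type
}

// Emulates one output row for a non-null, NUL-terminated input string.
constexpr bool all_of_type(const char* s, std::uint32_t types,
                           std::uint32_t verify_types = ALL_TYPES) {
  if (*s == '\0') return false;  // empty strings yield false
  for (; *s != '\0'; ++s) {
    std::uint32_t const flag = flags_of(*s);
    // A character is checked only if it has a type listed in verify_types;
    // with the default ALL_TYPES, untyped characters are checked as well.
    bool const verify = (verify_types & flag) != 0 ||
                        (flag == 0 && verify_types == ALL_TYPES);
    if (verify && (types & flag) == 0) return false;
  }
  return true;
}
}  // namespace sketch

// Default verify_types: every character must be LOWER.
static_assert(sketch::all_of_type("ab", sketch::LOWER), "");
static_assert(!sketch::all_of_type("a b", sketch::LOWER), "");
static_assert(!sketch::all_of_type("", sketch::LOWER), "");
// Only cased characters are verified, so spaces and digits are ignored.
static_assert(sketch::all_of_type("a b", sketch::LOWER, sketch::LOWER | sketch::UPPER), "");
static_assert(!sketch::all_of_type("a B", sketch::LOWER, sketch::LOWER | sketch::UPPER), "");
```

Narrowing verify_types to LOWER|UPPER turns the question from "is every character lowercase?" into "is every cased character lowercase?".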
std::unique_ptr<column> cudf::strings::filter_characters_of_type(
    strings_column_view const & input,
    string_character_types types_to_remove,
    string_scalar const & replacement = string_scalar(""),
    string_character_types types_to_keep = string_character_types::ALL_TYPES,
    rmm::cuda_stream_view stream = cudf::get_default_stream(),
    rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref())
Filter specific character types from a column of strings.
To remove all characters of a specific type, set that type in types_to_remove and set types_to_keep to ALL_TYPES.
To filter out characters NOT of a select type, specify ALL_TYPES for types_to_remove and list the types to retain in types_to_keep.
In s1 all NUMERIC types have been removed. In s2 all non-LOWER types have been replaced.
Exactly one of the parameters types_to_remove and types_to_keep must be set to ALL_TYPES.
Any null row results in a null entry for that row in the output column.
cudf::logic_error | if neither or both types_to_remove and types_to_keep are set to ALL_TYPES. |
input | Strings instance for this operation |
types_to_remove | The character types to check in each string. Use ALL_TYPES here to specify types_to_keep instead. |
replacement | The replacement character to use when removing characters |
types_to_keep | Default ALL_TYPES means all characters of types_to_remove will be filtered. |
stream | CUDA stream used for device memory operations and kernel launches |
mr | Device memory resource used to allocate the returned column's device memory |
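The two modes (remove listed types, or keep only listed types) can be illustrated with a host-side emulation. This is a sketch under stated assumptions rather than cudf code: `sketch`, `flags_of`, `filter_chars`, and `run_examples` are hypothetical names, classification is ASCII-only, `std::logic_error` stands in for `cudf::logic_error`, null rows are not modelled, and the input strings are illustrative choices.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

namespace sketch {
// Local mirror of the enumerators (assumption: same values as cudf's).
enum string_character_types : std::uint32_t {
  DECIMAL = 1 << 0, NUMERIC = 1 << 1, DIGIT = 1 << 2, ALPHA = 1 << 3,
  SPACE = 1 << 4, UPPER = 1 << 5, LOWER = 1 << 6,
  ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA,
  CASE_TYPES = UPPER | LOWER,
  ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE
};

// Hypothetical ASCII-only classifier standing in for cudf's lookup table.
inline std::uint32_t flags_of(char c) {
  if (c >= '0' && c <= '9') return DECIMAL | NUMERIC | DIGIT;
  if (c >= 'a' && c <= 'z') return ALPHA | LOWER;
  if (c >= 'A' && c <= 'Z') return ALPHA | UPPER;
  if (c == ' ' || (c >= '\t' && c <= '\r')) return SPACE;
  return 0;
}

// Emulates one output row for a non-null input string.
inline std::string filter_chars(std::string const& in,
                                std::uint32_t types_to_remove,
                                std::string const& replacement = "",
                                std::uint32_t types_to_keep = ALL_TYPES) {
  bool const remove_mode = (types_to_keep == ALL_TYPES);
  bool const keep_mode   = (types_to_remove == ALL_TYPES);
  if (remove_mode == keep_mode) {  // both or neither set to ALL_TYPES
    throw std::logic_error(
      "exactly one of types_to_remove and types_to_keep must be ALL_TYPES");
  }
  std::string out;
  for (char c : in) {
    std::uint32_t const flag = flags_of(c);
    // remove mode: drop characters that have a listed type;
    // keep mode: drop characters that have none of the kept types.
    bool const drop = remove_mode ? (types_to_remove & flag) != 0
                                  : (types_to_keep & flag) == 0;
    if (drop) { out += replacement; } else { out += c; }
  }
  return out;
}

// Usage: one call per mode on the same illustrative input.
inline void run_examples() {
  assert(filter_chars("a7 B2", NUMERIC) == "a B");                 // digits removed
  assert(filter_chars("a7 B2", ALL_TYPES, "-", LOWER) == "a----"); // non-LOWER replaced
}
}  // namespace sketch
```

Passing the masks as integers keeps the sketch independent of the enum's OR operators; the real function takes `string_character_types` values.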
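constexpr string_character_types cudf::strings::operator|(string_character_types lhs, string_character_types rhs)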
OR operator for combining string_character_types.
lhs | left-hand side of OR operation |
rhs | right-hand side of OR operation |
Definition at line 58 of file char_types_enum.hpp.
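constexpr string_character_types & cudf::strings::operator|=(string_character_types & lhs, string_character_types rhs)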
Compound assignment OR operator for combining string_character_types.
lhs | left-hand side of OR operation |
rhs | right-hand side of OR operation |
Returns lhs after combining lhs and rhs.
Definition at line 72 of file char_types_enum.hpp.
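One plausible shape for the two OR operators is sketched below, under the assumption that they cast through the enum's underlying type. The `sketch` namespace, the local enum mirror, and the helper `digits_and_space` are hypothetical; the actual definitions are the ones at the cited lines of char_types_enum.hpp.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {
// Local mirror of the enumerators (assumption: same values as cudf's).
enum string_character_types : std::uint32_t {
  DECIMAL = 1 << 0, NUMERIC = 1 << 1, DIGIT = 1 << 2, ALPHA = 1 << 3,
  SPACE = 1 << 4, UPPER = 1 << 5, LOWER = 1 << 6,
  ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA,
  CASE_TYPES = UPPER | LOWER,
  ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE
};

// OR two flag sets and return the result as the enum type, not an integer.
constexpr string_character_types operator|(string_character_types lhs,
                                           string_character_types rhs) {
  using U = std::underlying_type_t<string_character_types>;
  return static_cast<string_character_types>(static_cast<U>(lhs) |
                                             static_cast<U>(rhs));
}

// Compound form: updates lhs in place and returns a reference to it.
constexpr string_character_types& operator|=(string_character_types& lhs,
                                             string_character_types rhs) {
  lhs = lhs | rhs;
  return lhs;
}

// Hypothetical helper showing the compound operator in a constexpr context.
constexpr string_character_types digits_and_space() {
  string_character_types t = DIGIT;
  t |= SPACE;
  return t;
}
}  // namespace sketch

// The overload keeps the enum type, so the result can be passed directly
// where a string_character_types parameter is expected.
static_assert(std::is_same<decltype(sketch::NUMERIC | sketch::SPACE),
                           sketch::string_character_types>::value, "");
static_assert(sketch::digits_and_space() == 0x14u, "DIGIT|SPACE");
```

Returning the enum type is what lets a combination such as NUMERIC|SPACE be handed to a function parameter of type `string_character_types` without an explicit cast at the call site.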