cudf::strings Namespace Reference

Strings column APIs.

Classes

class get_json_object_options
 Settings for get_json_object().
 

Enumerations

enum string_character_types : uint32_t {
  DECIMAL = 1 << 0, NUMERIC = 1 << 1, DIGIT = 1 << 2, ALPHA = 1 << 3,
  SPACE = 1 << 4, UPPER = 1 << 5, LOWER = 1 << 6, ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA,
  CASE_TYPES = UPPER | LOWER, ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE
}
 Character type values. These types can be or'd to check for any combination of types.

enum separator_on_nulls { separator_on_nulls::YES, separator_on_nulls::NO }
 Setting for specifying how separators are added with null strings elements.

enum output_if_empty_list { output_if_empty_list::EMPTY_STRING, output_if_empty_list::NULL_ELEMENT }
 Setting for specifying what will be output from join_list_elements when an input list is empty.

enum pad_side { pad_side::LEFT, pad_side::RIGHT, pad_side::BOTH }
 Pad types for the pad method specify where the pad character should be placed.

enum regex_flags : uint32_t { DEFAULT = 0, MULTILINE = 8, DOTALL = 16 }
 Regex flags.

enum strip_type { strip_type::LEFT, strip_type::RIGHT, strip_type::BOTH }
 Direction identifier for strip() function.

enum filter_type : bool { filter_type::KEEP, filter_type::REMOVE }
 Removes or keeps the specified character ranges in cudf::strings::filter_characters.
 

Functions

std::unique_ptr< column > count_characters (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns an integer numeric column containing the length of each string in characters.

std::unique_ptr< column > count_bytes (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a numeric column containing the length of each string in bytes.
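
The difference between a length in characters and a length in bytes only shows up for multi-byte UTF-8 text. The following is a minimal host-side sketch of that distinction for a single std::string; the helper names count_characters_host and count_bytes_host are hypothetical and are not part of libcudf, which operates on device columns instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical host-side helper (not a libcudf function): counts UTF-8
// characters by skipping continuation bytes, which have the form 10xxxxxx.
std::size_t count_characters_host(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;  // count only lead bytes
    }
    return n;
}

// Hypothetical host-side helper: the byte length is the encoded size.
std::size_t count_bytes_host(const std::string& s) { return s.size(); }
```

For the UTF-8 encoding of "héllo", the first helper yields 5 and the second yields 6, because "é" occupies two bytes.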
 
std::unique_ptr< column > code_points (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a numeric column with code point values (integers) for each character of each string.

std::unique_ptr< column > capitalize (strings_column_view const &input, string_scalar const &delimiters=string_scalar(""), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of capitalized strings.

std::unique_ptr< column > title (strings_column_view const &input, string_character_types sequence_type=string_character_types::ALPHA, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Modifies first character of each word to upper-case and lower-cases the rest.

std::unique_ptr< column > is_title (strings_column_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Checks if the strings in the input column are title formatted.

std::unique_ptr< column > to_lower (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Converts a column of strings to lower case.

std::unique_ptr< column > to_upper (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Converts a column of strings to upper case.

std::unique_ptr< column > swapcase (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of strings converting lower case characters to upper case and vice versa.
 
string_character_types operator| (string_character_types lhs, string_character_types rhs)
 OR operator for combining string_character_types.
 
string_character_types & operator|= (string_character_types &lhs, string_character_types rhs)
 Compound assignment OR operator for combining string_character_types.
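
Because the enumerators of string_character_types are single-bit values, combining them is plain bitwise OR. The sketch below re-declares the enum from this page in a standalone file and shows one plausible way the two operators could be written; the operator bodies and the has_type helper are assumptions for illustration, not the library's source.

```cpp
#include <cassert>
#include <cstdint>

// Enumerator values copied from the enum listed on this page.
enum string_character_types : uint32_t {
    DECIMAL = 1 << 0, NUMERIC = 1 << 1, DIGIT = 1 << 2, ALPHA = 1 << 3,
    SPACE = 1 << 4, UPPER = 1 << 5, LOWER = 1 << 6,
    ALPHANUM = DECIMAL | NUMERIC | DIGIT | ALPHA,
    CASE_TYPES = UPPER | LOWER,
    ALL_TYPES = ALPHANUM | CASE_TYPES | SPACE
};

// Assumed implementation: OR the underlying integers and cast back.
constexpr string_character_types operator|(string_character_types lhs,
                                           string_character_types rhs) {
    return static_cast<string_character_types>(static_cast<uint32_t>(lhs) |
                                               static_cast<uint32_t>(rhs));
}

// Assumed implementation: compound form built on operator|.
constexpr string_character_types& operator|=(string_character_types& lhs,
                                             string_character_types rhs) {
    lhs = lhs | rhs;
    return lhs;
}

// Hypothetical helper (not in libcudf): true if any bit of t is set in flags.
constexpr bool has_type(string_character_types flags, string_character_types t) {
    return (static_cast<uint32_t>(flags) & static_cast<uint32_t>(t)) != 0;
}

static_assert((UPPER | LOWER) == CASE_TYPES, "CASE_TYPES is UPPER or'd with LOWER");
```

A combined value such as ALPHA | SPACE can then be passed wherever a string_character_types argument is expected.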
 
std::unique_ptr< column > all_characters_of_type (strings_column_view const &strings, string_character_types types, string_character_types verify_types=string_character_types::ALL_TYPES, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings entries in which all characters are of the type specified.

std::unique_ptr< column > filter_characters_of_type (strings_column_view const &strings, string_character_types types_to_remove, string_scalar const &replacement=string_scalar(""), string_character_types types_to_keep=string_character_types::ALL_TYPES, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Filter specific character types from a column of strings.

std::unique_ptr< column > join_strings (strings_column_view const &strings, string_scalar const &separator=string_scalar(""), string_scalar const &narep=string_scalar("", false), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Concatenates all strings in the column into one new string delimited by an optional separator string.

std::unique_ptr< column > concatenate (table_view const &strings_columns, strings_column_view const &separators, string_scalar const &separator_narep=string_scalar("", false), string_scalar const &col_narep=string_scalar("", false), separator_on_nulls separate_nulls=separator_on_nulls::YES, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Concatenates a list of strings columns using separators for each row and returns the result as a strings column.

std::unique_ptr< column > concatenate (table_view const &strings_columns, string_scalar const &separator=string_scalar(""), string_scalar const &narep=string_scalar("", false), separator_on_nulls separate_nulls=separator_on_nulls::YES, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Row-wise concatenates the given list of strings columns and returns a single strings column result.

std::unique_ptr< column > join_list_elements (const lists_column_view &lists_strings_column, const strings_column_view &separators, string_scalar const &separator_narep=string_scalar("", false), string_scalar const &string_narep=string_scalar("", false), separator_on_nulls separate_nulls=separator_on_nulls::YES, output_if_empty_list empty_list_policy=output_if_empty_list::EMPTY_STRING, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Given a lists column of strings (each row is a list of strings), concatenates the strings within each row and returns a single strings column result.

std::unique_ptr< column > join_list_elements (const lists_column_view &lists_strings_column, string_scalar const &separator=string_scalar(""), string_scalar const &narep=string_scalar("", false), separator_on_nulls separate_nulls=separator_on_nulls::YES, output_if_empty_list empty_list_policy=output_if_empty_list::EMPTY_STRING, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Given a lists column of strings (each row is a list of strings), concatenates the strings within each row and returns a single strings column result.
 
std::unique_ptr< column > contains_re (strings_column_view const &strings, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying rows which match the given regex pattern.

std::unique_ptr< column > matches_re (strings_column_view const &strings, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying rows which match the given regex pattern, but only at the beginning of the string.

std::unique_ptr< column > count_re (strings_column_view const &strings, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns the number of times the given regex pattern matches in each string.
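
The three regex queries above differ only in where a match may occur and whether matches are counted. As a rough host-side analogy for a single string, the sketch below uses std::regex as a stand-in engine; the *_host function names are hypothetical, and the regex dialect accepted by libcudf may differ from the ECMAScript grammar that std::regex uses by default.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical analogue of contains_re: a match anywhere in the string.
bool contains_re_host(const std::string& s, const std::string& pattern) {
    return std::regex_search(s, std::regex(pattern));
}

// Hypothetical analogue of matches_re: the match must start at position 0.
bool matches_re_host(const std::string& s, const std::string& pattern) {
    return std::regex_search(s, std::regex(pattern),
                             std::regex_constants::match_continuous);
}

// Hypothetical analogue of count_re: number of non-overlapping matches.
std::size_t count_re_host(const std::string& s, const std::string& pattern) {
    const std::regex re(pattern);
    return static_cast<std::size_t>(std::distance(
        std::sregex_iterator(s.begin(), s.end(), re), std::sregex_iterator()));
}
```

With the pattern "[0-9]+", the string "abc123" is reported by the first helper but not the second, while "123abc" is reported by both.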
 
std::unique_ptr< column > to_booleans (strings_column_view const &strings, string_scalar const &true_string=string_scalar("true"), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new BOOL8 column by parsing boolean values from the strings in the provided strings column.

std::unique_ptr< column > from_booleans (column_view const &booleans, string_scalar const &true_string=string_scalar("true"), string_scalar const &false_string=string_scalar("false"), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting the boolean values from the provided column into strings.

std::unique_ptr< column > to_timestamps (strings_column_view const &strings, data_type timestamp_type, std::string const &format, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new timestamp column converting a strings column into timestamps using the provided format pattern.

std::unique_ptr< column > is_timestamp (strings_column_view const &strings, std::string const &format, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Verifies the given strings column can be parsed to timestamps using the provided format pattern.

std::unique_ptr< column > from_timestamps (column_view const &timestamps, std::string const &format="%Y-%m-%dT%H:%M:%SZ", strings_column_view const &names=strings_column_view(column_view{ data_type{type_id::STRING}, 0, nullptr}), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting a timestamp column into strings using the provided format pattern.

std::unique_ptr< column > to_durations (strings_column_view const &strings, data_type duration_type, std::string const &format, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new duration column converting a strings column into durations using the provided format pattern.

std::unique_ptr< column > from_durations (column_view const &durations, std::string const &format="%D days %H:%M:%S", rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting a duration column into strings using the provided format pattern.

std::unique_ptr< column > to_fixed_point (strings_column_view const &input, data_type output_type, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new fixed-point column parsing decimal values from the provided strings column.

std::unique_ptr< column > from_fixed_point (column_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting the fixed-point values into a strings column.

std::unique_ptr< column > is_fixed_point (strings_column_view const &input, data_type decimal_type=data_type{type_id::DECIMAL64}, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to fixed-point.
 
std::unique_ptr< column > to_floats (strings_column_view const &strings, data_type output_type, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new numeric column by parsing float values from each string in the provided strings column.

std::unique_ptr< column > from_floats (column_view const &floats, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting the float values from the provided column into strings.

std::unique_ptr< column > is_float (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to floats.

std::unique_ptr< column > to_integers (strings_column_view const &strings, data_type output_type, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new integer numeric column parsing integer values from the provided strings column.

std::unique_ptr< column > from_integers (column_view const &integers, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting the integer values from the provided column into strings.

std::unique_ptr< column > is_integer (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to integers.

std::unique_ptr< column > is_integer (strings_column_view const &strings, data_type int_type, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to integers.

std::unique_ptr< column > hex_to_integers (strings_column_view const &strings, data_type output_type, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new integer numeric column parsing hexadecimal values from the provided strings column.

std::unique_ptr< column > is_hex (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to integers from hex.

std::unique_ptr< column > integers_to_hex (column_view const &input, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column converting integer columns to hexadecimal characters.
 
std::unique_ptr< column > ipv4_to_integers (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Converts IPv4 addresses into integers.

std::unique_ptr< column > integers_to_ipv4 (column_view const &integers, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Converts integers into IPv4 addresses as strings.

std::unique_ptr< column > is_ipv4 (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a boolean column identifying strings in which all characters are valid for conversion to integers from IPv4 format.
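
The IPv4 conversions amount to packing four dotted octets into one integer, most significant octet first, and unpacking them again. A minimal host-side sketch for one address follows; the function names are hypothetical, the input is assumed to be well formed (no validation is done), and the integer width used by the library's output column is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical host-side helper: packs "a.b.c.d" into a 32-bit value.
// Assumes a well-formed dotted-decimal address.
std::uint32_t ipv4_to_integer_host(const std::string& ip) {
    std::uint32_t result = 0;
    std::uint32_t octet = 0;
    for (char c : ip) {
        if (c == '.') {
            result = (result << 8) | octet;  // shift in the finished octet
            octet = 0;
        } else {
            octet = octet * 10 + static_cast<std::uint32_t>(c - '0');
        }
    }
    return (result << 8) | octet;  // shift in the last octet
}

// Hypothetical host-side helper: the inverse of the function above.
std::string integer_to_ipv4_host(std::uint32_t v) {
    return std::to_string((v >> 24) & 0xFF) + "." +
           std::to_string((v >> 16) & 0xFF) + "." +
           std::to_string((v >> 8) & 0xFF) + "." +
           std::to_string(v & 0xFF);
}
```

Under this packing, "192.168.0.1" corresponds to 3232235521, and converting that value back yields the same address.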
 
std::unique_ptr< column > format_list_column (lists_column_view const &input, string_scalar const &na_rep=string_scalar("NULL"), strings_column_view const &separators=strings_column_view(column_view{ data_type{type_id::STRING}, 0, nullptr}), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Convert a list column of strings into a formatted strings column.
 
std::unique_ptr< column > url_encode (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Encodes each string using URL encoding.

std::unique_ptr< column > url_decode (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Decodes each string using URL encoding.
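
URL encoding replaces bytes outside a safe set with a '%' followed by two hexadecimal digits, and decoding reverses that. The sketch below shows the idea on a single host string; the function names are hypothetical, and the choice of the RFC 3986 unreserved set (letters, digits, '-', '_', '.', '~') as the characters left untouched is an assumption, not a statement about which characters libcudf escapes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: only RFC 3986 "unreserved" characters pass through unchanged.
bool is_unreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '-' || c == '_' || c == '.' || c == '~';
}

// Returns 0-15 for a hex digit, or -1 if c is not a hex digit.
int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Hypothetical host-side helper: percent-encodes every other byte.
std::string url_encode_host(const std::string& s) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        if (is_unreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += digits[c >> 4];
            out += digits[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical host-side helper: decodes %XX sequences, copies the rest.
std::string url_decode_host(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == '%' && i + 2 < s.size() + 0 &&
            hex_value(s[i + 1]) >= 0 && hex_value(s[i + 2]) >= 0) {
            out += static_cast<char>(hex_value(s[i + 1]) * 16 + hex_value(s[i + 2]));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += s[i];
        }
    }
    return out;
}
```

In this sketch a '%' that is not followed by two hex digits is copied through unchanged when decoding.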
 
std::unique_ptr< table > extract (strings_column_view const &strings, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a table of strings columns where each column corresponds to the matching group specified in the given regular expression pattern.

std::unique_ptr< column > extract_all_record (strings_column_view const &strings, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a lists column of strings where each string column row corresponds to the matching group specified in the given regular expression pattern.
 
std::unique_ptr< column > find (strings_column_view const &strings, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found in each string of the provided column.

std::unique_ptr< column > rfind (strings_column_view const &strings, string_scalar const &target, size_type start=0, size_type stop=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of character position values where the target string is first found searching from the end of each string.
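
The start and stop parameters bound the search, with the default stop of -1 read here as "to the end of the string". The following host-side sketch for one ASCII string rests on several assumptions that this page does not confirm: the whole match must lie inside [start, stop), a missing target is reported as -1, and byte positions equal character positions. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: clamps [start, stop) and returns that window of s,
// or signals an empty search range through the ok flag.
static std::string window_of(const std::string& s, int start, int stop, bool& ok) {
    const int len = static_cast<int>(s.size());
    const int end = (stop < 0 || stop > len) ? len : stop;
    ok = (start >= 0 && start <= end);
    return ok ? s.substr(static_cast<std::size_t>(start),
                         static_cast<std::size_t>(end - start))
              : std::string();
}

// Hypothetical analogue of find: first occurrence inside the window, else -1.
int find_host(const std::string& s, const std::string& target,
              int start = 0, int stop = -1) {
    bool ok = false;
    const std::string w = window_of(s, start, stop, ok);
    if (!ok) return -1;
    const std::size_t pos = w.find(target);
    return pos == std::string::npos ? -1 : start + static_cast<int>(pos);
}

// Hypothetical analogue of rfind: last occurrence inside the window, else -1.
int rfind_host(const std::string& s, const std::string& target,
               int start = 0, int stop = -1) {
    bool ok = false;
    const std::string w = window_of(s, start, stop, ok);
    if (!ok) return -1;
    const std::size_t pos = w.rfind(target);
    return pos == std::string::npos ? -1 : start + static_cast<int>(pos);
}
```

For "banana" and the target "an", the forward search reports position 1 and the reverse search reports position 3.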
 
std::unique_ptr< column > contains (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found within that string in the provided column.

std::unique_ptr< column > contains (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding target string was found within that string in the provided column.

std::unique_ptr< column > starts_with (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the beginning of that string in the provided column.

std::unique_ptr< column > starts_with (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the beginning of that string in the provided column.

std::unique_ptr< column > ends_with (strings_column_view const &strings, string_scalar const &target, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the target string was found at the end of that string in the provided column.

std::unique_ptr< column > ends_with (strings_column_view const &strings, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a column of boolean values for each string where true indicates the corresponding string in the targets column was found at the end of that string in the provided column.

std::unique_ptr< column > find_multiple (strings_column_view const &input, strings_column_view const &targets, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a lists column with character position values where each of the target strings is found in each string.

std::unique_ptr< table > findall (strings_column_view const &input, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a table of strings columns for each matching occurrence of the regex pattern within each string.

std::unique_ptr< column > findall_record (strings_column_view const &input, std::string const &pattern, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a lists column of strings for each matching occurrence of the regex pattern within each string.

std::unique_ptr< cudf::column > get_json_object (cudf::strings_column_view const &col, cudf::string_scalar const &json_path, get_json_object_options options=get_json_object_options{}, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Apply a JSONPath string to all rows in an input strings column.
 
std::unique_ptr< column > pad (strings_column_view const &strings, size_type width, pad_side side=cudf::strings::pad_side::RIGHT, std::string const &fill_char=" ", rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Add padding to each string using a provided character.

std::unique_ptr< column > zfill (strings_column_view const &strings, size_type width, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Add '0' as padding to the left of each string.
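
The pad_side value selects where the fill character goes, and zfill can be read as left padding with '0'. Below is a minimal host-side sketch for a single ASCII string. The function names are hypothetical, the way an odd amount of padding is split for BOTH is an assumption, and any special handling of sign characters by zfill is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors the enumerators listed on this page.
enum class pad_side { LEFT, RIGHT, BOTH };

// Hypothetical host-side helper: pads s with fill up to width characters.
// Strings already at least width long are returned unchanged.
std::string pad_host(const std::string& s, std::size_t width,
                     pad_side side = pad_side::RIGHT, char fill = ' ') {
    if (s.size() >= width) return s;
    const std::size_t total = width - s.size();
    std::size_t left = 0;
    switch (side) {
        case pad_side::LEFT:  left = total;     break;  // all padding before s
        case pad_side::RIGHT: left = 0;         break;  // all padding after s
        case pad_side::BOTH:  left = total / 2; break;  // assumed split
    }
    return std::string(left, fill) + s + std::string(total - left, fill);
}

// Hypothetical host-side helper: left-pads with '0'.
std::string zfill_host(const std::string& s, std::size_t width) {
    return pad_host(s, width, pad_side::LEFT, '0');
}
```

With a width of 5, "abc" becomes "abc  " under the default RIGHT side and "42" becomes "00042" under the zfill sketch.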
 
constexpr bool is_multiline (regex_flags const f)
 Returns true if the given flags contain MULTILINE.

constexpr bool is_dotall (regex_flags const f)
 Returns true if the given flags contain DOTALL.
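
Since MULTILINE and DOTALL occupy distinct bits (8 and 16), each predicate reduces to a mask test. The sketch below re-declares regex_flags with the values shown on this page and gives one plausible body for each function; the bodies are assumptions for illustration rather than the library's source.

```cpp
#include <cassert>
#include <cstdint>

// Enumerator values copied from the enum listed on this page.
enum regex_flags : uint32_t { DEFAULT = 0, MULTILINE = 8, DOTALL = 16 };

// Assumed implementation: true when the MULTILINE bit is set in f.
constexpr bool is_multiline(regex_flags const f) {
    return (f & MULTILINE) == MULTILINE;
}

// Assumed implementation: true when the DOTALL bit is set in f.
constexpr bool is_dotall(regex_flags const f) {
    return (f & DOTALL) == DOTALL;
}

static_assert(!is_multiline(DEFAULT) && !is_dotall(DEFAULT),
              "DEFAULT sets neither flag");
```

Both flags can be requested together by OR-ing the values and casting back, for example static_cast<regex_flags>(MULTILINE | DOTALL).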
 
std::unique_ptr< string_scalar > repeat_string (string_scalar const &input, size_type repeat_times, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Repeat the given string scalar a given number of times.

std::unique_ptr< column > repeat_strings (strings_column_view const &input, size_type repeat_times, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Repeat each string in the given strings column a given number of times.

std::unique_ptr< column > repeat_strings (strings_column_view const &input, column_view const &repeat_times, std::optional< column_view > output_strings_sizes=std::nullopt, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Repeat each string in the given strings column by the number of times given in another numeric column.

std::pair< std::unique_ptr< column >, int64_t > repeat_strings_output_sizes (strings_column_view const &input, column_view const &repeat_times, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Compute sizes of the output strings if each string in the input strings column is repeated by the number of times given in another numeric column.
 
std::unique_ptr< column > replace (strings_column_view const &strings, string_scalar const &target, string_scalar const &repl, int32_t maxrepl=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Replaces target string within each string with the specified replacement string.
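
The maxrepl parameter caps how many occurrences are replaced, and its default of -1 is read here as "replace every occurrence". A minimal host-side sketch for one string follows; the function name is hypothetical, and returning the input unchanged for an empty target is an assumption made only to keep the sketch well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical host-side helper: replaces up to maxrepl non-overlapping
// occurrences of target, scanning left to right; maxrepl < 0 means no limit.
std::string replace_host(const std::string& s, const std::string& target,
                         const std::string& repl, int maxrepl = -1) {
    if (target.empty()) return s;  // assumption: nothing to search for
    std::string out;
    std::size_t pos = 0;
    int done = 0;
    while (maxrepl < 0 || done < maxrepl) {
        const std::size_t hit = s.find(target, pos);
        if (hit == std::string::npos) break;
        out.append(s, pos, hit - pos);  // copy text before the match
        out += repl;
        pos = hit + target.size();      // resume after the match
        ++done;
    }
    out.append(s, pos, std::string::npos);  // copy the remaining tail
    return out;
}
```

With "a-b-c", target "-" and replacement "+", the sketch yields "a+b+c" by default and "a+b-c" when maxrepl is 1.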
 
std::unique_ptr< column > replace_slice (strings_column_view const &strings, string_scalar const &repl=string_scalar(""), size_type start=0, size_type stop=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Replaces the characters in the [start,stop) position range of each string with the provided repl string.

std::unique_ptr< column > replace (strings_column_view const &strings, strings_column_view const &targets, strings_column_view const &repls, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Replaces substrings matching a list of targets with the corresponding replacement strings.

std::unique_ptr< column > replace_re (strings_column_view const &strings, std::string const &pattern, string_scalar const &replacement=string_scalar(""), std::optional< size_type > max_replace_count=std::nullopt, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 For each string, replaces any character sequence matching the given pattern with the provided replacement string.

std::unique_ptr< column > replace_re (strings_column_view const &strings, std::vector< std::string > const &patterns, strings_column_view const &replacements, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 For each string, replaces any character sequence matching the given patterns with the corresponding string in the replacements column.

std::unique_ptr< column > replace_with_backrefs (strings_column_view const &strings, std::string const &pattern, std::string const &replacement, regex_flags const flags=regex_flags::DEFAULT, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 For each string, replaces any character sequence matching the given pattern using the replacement template for back-references.
 
std::unique_ptr< table > partition (strings_column_view const &strings, string_scalar const &delimiter=string_scalar(""), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a set of 3 columns by splitting each string using the specified delimiter.

std::unique_ptr< table > rpartition (strings_column_view const &strings, string_scalar const &delimiter=string_scalar(""), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a set of 3 columns by splitting each string using the specified delimiter starting from the end of each string.
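
The three output columns can be pictured as the text before the delimiter, the delimiter itself, and the text after it. The host-side sketch below applies that reading to a single string. The function names are hypothetical, a non-empty delimiter is assumed, and the behavior when the delimiter is absent is modeled on Python's str.partition and str.rpartition, which is an assumption rather than something this page states.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

using parts = std::array<std::string, 3>;

// Hypothetical analogue of partition: split at the first delimiter.
// If the delimiter is absent, the whole string lands in the first part.
parts partition_host(const std::string& s, const std::string& delim) {
    const std::size_t pos = s.find(delim);
    if (pos == std::string::npos) return {s, "", ""};
    return {s.substr(0, pos), delim, s.substr(pos + delim.size())};
}

// Hypothetical analogue of rpartition: split at the last delimiter.
// If the delimiter is absent, the whole string lands in the last part.
parts rpartition_host(const std::string& s, const std::string& delim) {
    const std::size_t pos = s.rfind(delim);
    if (pos == std::string::npos) return {"", "", s};
    return {s.substr(0, pos), delim, s.substr(pos + delim.size())};
}
```

For "a_b_c" with the delimiter "_", the forward version gives "a", "_", "b_c" and the reverse version gives "a_b", "_", "c".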
 
std::unique_ptr< table > split (strings_column_view const &strings_column, string_scalar const &delimiter=string_scalar(""), size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a list of columns by splitting each string using the specified delimiter.

std::unique_ptr< table > rsplit (strings_column_view const &strings_column, string_scalar const &delimiter=string_scalar(""), size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a list of columns by splitting each string using the specified delimiter starting from the end of each string.

std::unique_ptr< column > split_record (strings_column_view const &strings, string_scalar const &delimiter=string_scalar(""), size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits individual strings elements into a list of strings.

std::unique_ptr< column > rsplit_record (strings_column_view const &strings, string_scalar const &delimiter=string_scalar(""), size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits individual strings elements into a list of strings starting from the end of each string.

std::unique_ptr< table > split_re (strings_column_view const &input, std::string const &pattern, size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits strings elements into a table of strings columns using a regex pattern to delimit each string.

std::unique_ptr< table > rsplit_re (strings_column_view const &input, std::string const &pattern, size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits strings elements into a table of strings columns using a regex pattern to delimit each string starting from the end of the string.

std::unique_ptr< column > split_record_re (strings_column_view const &input, std::string const &pattern, size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits strings elements into a list column of strings using the given regex pattern to delimit each string.

std::unique_ptr< column > rsplit_record_re (strings_column_view const &input, std::string const &pattern, size_type maxsplit=-1, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Splits strings elements into a list column of strings using the given regex pattern to delimit each string starting from the end of the string.
 
std::unique_ptr< column > strip (strings_column_view const &strings, strip_type stype=strip_type::BOTH, string_scalar const &to_strip=string_scalar(""), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Removes the specified characters from the beginning or end (or both) of each string.

std::unique_ptr< column > slice_strings (strings_column_view const &strings, numeric_scalar< size_type > const &start=numeric_scalar< size_type >(0, false), numeric_scalar< size_type > const &stop=numeric_scalar< size_type >(0, false), numeric_scalar< size_type > const &step=numeric_scalar< size_type >(1), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column that contains substrings of the strings in the provided column.

std::unique_ptr< column > slice_strings (strings_column_view const &strings, column_view const &starts, column_view const &stops, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a new strings column that contains substrings of the strings in the provided column using unique ranges for each string.

std::unique_ptr< column > slice_strings (strings_column_view const &strings, string_scalar const &delimiter, size_type count, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Slices a column of strings by using a delimiter as a slice point.

std::unique_ptr< column > slice_strings (strings_column_view const &strings, strings_column_view const &delimiter_strings, size_type count, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Slices a column of strings by using a delimiter column as slice points.

std::unique_ptr< column > translate (strings_column_view const &strings, std::vector< std::pair< char_utf8, char_utf8 >> const &chars_table, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Translates individual characters within each string.

std::unique_ptr< column > filter_characters (strings_column_view const &strings, std::vector< std::pair< cudf::char_utf8, cudf::char_utf8 >> characters_to_filter, filter_type keep_characters=filter_type::KEEP, string_scalar const &replacement=string_scalar(""), rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Removes ranges of characters from each string in a strings column.

std::unique_ptr< column > wrap (strings_column_view const &strings, size_type width, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Wraps strings onto multiple lines shorter than width by replacing appropriate white space with new-line characters (ASCII 0x0A).
 

Detailed Description

Strings column APIs.