Strings

Modules

 Case
 
 Character Types
 
 Combining
 
 Searching
 
 Converting
 
 Copying
 
 Substring
 
 Finding
 
 Modifying
 
 Replacing
 
 Splitting
 
 JSON
 
 Regex
 

Files

file  attributes.hpp
 Read attributes of strings column.
 

Functions

std::unique_ptr< column > cudf::strings::count_characters (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns an integer numeric column containing the length of each string in characters.
 
std::unique_ptr< column > cudf::strings::count_bytes (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Returns a numeric column containing the length of each string in bytes.
 
std::unique_ptr< column > cudf::strings::code_points (strings_column_view const &strings, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource())
 Creates a numeric column with code point values (integers) for each character of each string.
 

Detailed Description

Function Documentation

◆ code_points()

std::unique_ptr<column> cudf::strings::code_points ( strings_column_view const &  strings,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Creates a numeric column with code point values (integers) for each character of each string.

A code point is the integer value representation of a character. For example, the code point value for the character 'A' in UTF-8 is 65.

The size of the output column will be the total number of characters in the strings column.

Any null string is ignored. No null entries will appear in the output column.

Parameters
 strings  Strings instance for this operation.
 mr  Device memory resource used to allocate the returned column's device memory.
Returns
New INT32 column with code point integer values for each character.
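The sizing and null-handling rules above can be illustrated with a host-side sketch in standard C++. This is not the libcudf implementation and does not call libcudf. The function name `code_points_sketch`, the helper `utf8_length`, and the use of a `std::vector<bool>` as a validity mask are hypothetical. The sketch also assumes that the "code point value in UTF-8" is the character's UTF-8 byte sequence packed into one integer, which agrees with the 'A' = 65 example; a decode to Unicode scalar values would give different results for non-ASCII characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of bytes in the UTF-8 sequence that starts with lead byte `b`.
// Assumes well-formed UTF-8; any unexpected byte is treated as length 1.
inline int utf8_length(unsigned char b) {
  if (b < 0x80) return 1;
  if ((b & 0xE0) == 0xC0) return 2;
  if ((b & 0xF0) == 0xE0) return 3;
  if ((b & 0xF8) == 0xF0) return 4;
  return 1;
}

// Hypothetical host-side model: one output value per character of every
// non-null row. Null rows (valid[r] == false) contribute nothing, so the
// output has no null entries and its size is the total character count.
std::vector<std::int32_t> code_points_sketch(const std::vector<std::string>& rows,
                                             const std::vector<bool>& valid) {
  assert(rows.size() == valid.size());
  std::vector<std::int32_t> out;
  for (std::size_t r = 0; r < rows.size(); ++r) {
    if (!valid[r]) continue;  // null string is ignored
    const std::string& s = rows[r];
    for (std::size_t i = 0; i < s.size();) {
      const int len = utf8_length(static_cast<unsigned char>(s[i]));
      std::uint32_t value = 0;
      // Pack the bytes of this character, most significant byte first.
      for (int k = 0; k < len && i < s.size(); ++k, ++i) {
        value = (value << 8) | static_cast<unsigned char>(s[i]);
      }
      // 4-byte sequences do not fit in a positive INT32; the narrowing
      // conversion here is left as-is for the purposes of the sketch.
      out.push_back(static_cast<std::int32_t>(value));
    }
  }
  return out;
}
```

Under these assumptions, an input of three valid rows and one null row produces one flat list of values whose length is the number of characters in the valid rows only, so the output generally has a different row count from the input.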

◆ count_bytes()

std::unique_ptr<column> cudf::strings::count_bytes ( strings_column_view const &  strings,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns a numeric column containing the length of each string in bytes.

The output column will have the same number of rows as the specified strings column. Each row value will be the number of bytes in the corresponding string.

Any null string will result in a null entry for that row in the output column.

Parameters
 strings  Strings instance for this operation.
 mr  Device memory resource used to allocate the returned column's device memory.
Returns
New INT32 column with the number of bytes for each string.
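As a rough illustration of the row-for-row behavior described above, here is a host-side sketch in standard C++. It models the semantics only and does not use libcudf. The names `ByteCounts` and `count_bytes_sketch` are hypothetical, and the `std::vector<bool>` validity mask is an assumed stand-in for a column's null mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a nullable INT32 column on the host.
struct ByteCounts {
  std::vector<std::int32_t> values;  // one entry per input row
  std::vector<bool> valid;           // false marks a null entry
};

// One output row per input row. A null input row yields a null output row;
// its stored value is a placeholder and should not be read.
ByteCounts count_bytes_sketch(const std::vector<std::string>& rows,
                              const std::vector<bool>& valid) {
  assert(rows.size() == valid.size());
  ByteCounts out;
  out.values.reserve(rows.size());
  out.valid.reserve(rows.size());
  for (std::size_t r = 0; r < rows.size(); ++r) {
    if (valid[r]) {
      // std::string::size() is the byte length, not the character count.
      out.values.push_back(static_cast<std::int32_t>(rows[r].size()));
      out.valid.push_back(true);
    } else {
      out.values.push_back(0);
      out.valid.push_back(false);
    }
  }
  return out;
}
```

Note that an empty string and a null string are distinct in this model: the empty string yields a valid count of 0, while the null row yields a null entry.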

◆ count_characters()

std::unique_ptr<column> cudf::strings::count_characters ( strings_column_view const &  strings,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
)

Returns an integer numeric column containing the length of each string in characters.

The output column will have the same number of rows as the specified strings column. Each row value will be the number of characters in the corresponding string.

Any null string will result in a null entry for that row in the output column.

Parameters
 strings  Strings instance for this operation.
 mr  Device memory resource used to allocate the returned column's device memory.
Returns
New INT32 column with lengths for each string.
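The difference between a length in characters and a length in bytes can be shown with a host-side sketch in standard C++. It is a model of the semantics under the assumption that strings are well-formed UTF-8, not the libcudf implementation. The names `CharCounts`, `utf8_char_count`, and `count_characters_sketch` are hypothetical, as is the `std::vector<bool>` validity mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a nullable INT32 column on the host.
struct CharCounts {
  std::vector<std::int32_t> values;  // one entry per input row
  std::vector<bool> valid;           // false marks a null entry
};

// Counts UTF-8 characters by counting every byte that is not a
// continuation byte (continuation bytes have the bit pattern 10xxxxxx).
inline std::int32_t utf8_char_count(const std::string& s) {
  std::int32_t n = 0;
  for (unsigned char b : s) {
    if ((b & 0xC0) != 0x80) ++n;
  }
  return n;
}

// One output row per input row; null input rows stay null in the output.
CharCounts count_characters_sketch(const std::vector<std::string>& rows,
                                   const std::vector<bool>& valid) {
  assert(rows.size() == valid.size());
  CharCounts out;
  for (std::size_t r = 0; r < rows.size(); ++r) {
    out.values.push_back(valid[r] ? utf8_char_count(rows[r]) : 0);
    out.valid.push_back(valid[r]);
  }
  return out;
}
```

For ASCII-only strings the character count equals the byte count. The two differ as soon as a string contains a multi-byte character, for example a two-byte accented letter inside an otherwise ASCII word.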