normalize
- pylibcudf.nvtext.normalize.normalize_characters(Column input, bool do_lower_case) → Column
Normalizes the characters of each string in preparation for tokenizing.
For details, see normalize_characters().
- Parameters:
- input : Column
Input strings
- do_lower_case : bool
If true, upper-case characters are converted to lower-case and accents are stripped from those characters. If false, accented and upper-case characters are not transformed.
- Returns:
- Column
Normalized strings column
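The effect of the do_lower_case flag can be illustrated with a small CPU-only sketch in plain Python. The helper name normalize_characters_sketch is hypothetical and is not part of pylibcudf. The sketch covers only the lower-casing and accent stripping described above, and it assumes accents are stripped from every character when the flag is true. The GPU function may apply further transformations that are not shown here.

```python
import unicodedata


def normalize_characters_sketch(text: str, do_lower_case: bool) -> str:
    """Hypothetical CPU illustration of the do_lower_case flag.

    This is not the pylibcudf implementation; it only mimics the
    lower-casing and accent stripping described in the parameter text.
    """
    if not do_lower_case:
        # Accented and upper-case characters are left untouched.
        return text
    lowered = text.lower()
    # NFD splits each accented character into a base character plus
    # combining marks (Unicode category "Mn"), which are then dropped.
    decomposed = unicodedata.normalize("NFD", lowered)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


print(normalize_characters_sketch("Résumé CAFÉ", True))   # resume cafe
print(normalize_characters_sketch("Résumé CAFÉ", False))  # Résumé CAFÉ
```

With pylibcudf itself, the same flag would be passed as the second argument alongside a strings Column as input.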
- pylibcudf.nvtext.normalize.normalize_spaces(Column input) → Column
Returns a new strings column by normalizing the whitespace in each string in the input column.
For details, see normalize_spaces().
- Parameters:
- input : Column
Input strings
- Returns:
- Column
New strings column of normalized strings.
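As a rough CPU-only illustration of whitespace normalization, the sketch below collapses each run of whitespace into a single space and trims both ends of the string. That behavior is an assumption taken from the linked normalize_spaces() documentation rather than from the text above, and the helper name normalize_spaces_sketch is hypothetical. It operates on a Python list of strings, not on a pylibcudf Column.

```python
def normalize_spaces_sketch(text: str) -> str:
    """Hypothetical CPU illustration; not the pylibcudf implementation."""
    # str.split() with no arguments splits on runs of any whitespace and
    # discards leading and trailing whitespace, so rejoining with a single
    # space yields the normalized form.
    return " ".join(text.split())


rows = ["  hello \t world\n", "a  b", ""]
print([normalize_spaces_sketch(row) for row in rows])
# ['hello world', 'a b', '']
```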