replace

pylibcudf.nvtext.replace.filter_tokens(Column input, size_type min_token_length, Scalar replacement=None, Scalar delimiter=None) -> Column

Removes tokens whose lengths are less than a specified number of characters.

For details, see the libcudf C++ function filter_tokens().

Parameters:
input : Column

Strings column whose tokens are filtered

min_token_length : size_type

Minimum number of characters a token must have to be retained in the output string

replacement : Scalar, optional

Optional replacement string to be used in place of removed tokens

delimiter : Scalar, optional

Characters used to separate each string into tokens. The default, an empty string, identifies tokens using whitespace.

Returns:
Column

New strings column of filtered strings
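
The behaviour described above can be approximated with a small pure-Python reference model. This is only a sketch of the documented semantics, not the GPU implementation: the name filter_tokens_model is hypothetical, plain Python lists and strings stand in for Column and Scalar, and three details are assumptions: separators are kept in place when a token is removed, a non-empty delimiter is treated as one literal separator string, and null rows pass through unchanged.

```python
import re


def filter_tokens_model(strings, min_token_length, replacement="", delimiter=""):
    """Hypothetical CPU model of the documented filter_tokens behaviour."""
    # Assumption: an empty delimiter means "split on runs of whitespace".
    # The capture group makes re.split keep the separators in the result.
    pattern = r"(\s+)" if delimiter == "" else "(" + re.escape(delimiter) + ")"
    result = []
    for s in strings:
        if s is None:  # assumption: null rows stay null
            result.append(None)
            continue
        parts = re.split(pattern, s)
        # Even indices hold tokens, odd indices hold the separators.
        for i in range(0, len(parts), 2):
            token = parts[i]
            if token and len(token) < min_token_length:
                parts[i] = replacement
        result.append("".join(parts))
    return result


print(filter_tokens_model(["this is me", "theme music"], 3))
# ['this  ', 'theme music']
print(filter_tokens_model(["this is me"], 3, replacement="--"))
# ['this -- --']
```

In this model, removing a token leaves its surrounding separators untouched, which is why the first result ends in two spaces. Whether the library preserves separators in exactly this way should be checked against the libcudf documentation.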

pylibcudf.nvtext.replace.replace_tokens(Column input, Column targets, Column replacements, Scalar delimiter=None) -> Column

Replaces specified tokens with corresponding replacement strings.

For details, see the libcudf C++ function replace_tokens().

Parameters:
input : Column

Strings column whose tokens are replaced

targets : Column

Strings to compare against tokens found in input

replacements : Column

Replacement strings for each string in targets

delimiter : Scalar, optional

Characters used to separate each string into tokens. The default, an empty string, identifies tokens using whitespace.

Returns:
Column

New strings column with replaced strings
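
As an illustration of whole-token replacement, the following pure-Python reference model mimics the documented behaviour on plain lists. It is a sketch, not the library implementation: replace_tokens_model is a hypothetical name, targets and replacements are assumed to have the same length, and a non-empty delimiter is assumed to be one literal separator string.

```python
import re


def replace_tokens_model(strings, targets, replacements, delimiter=""):
    """Hypothetical CPU model of the documented replace_tokens behaviour."""
    # Assumption: targets[i] is replaced by replacements[i].
    mapping = dict(zip(targets, replacements))
    # Assumption: an empty delimiter means "split on runs of whitespace".
    pattern = r"(\s+)" if delimiter == "" else "(" + re.escape(delimiter) + ")"
    result = []
    for s in strings:
        if s is None:  # assumption: null rows stay null
            result.append(None)
            continue
        parts = re.split(pattern, s)
        # Even indices hold tokens; only exact token matches are replaced.
        for i in range(0, len(parts), 2):
            parts[i] = mapping.get(parts[i], parts[i])
        result.append("".join(parts))
    return result


print(replace_tokens_model(["this is me", "theme music"], ["is", "me"], ["+", "_"]))
# ['this + _', 'theme music']
```

Note that "theme" is left unchanged even though it contains "me": in this model a target matches only a complete token, not a substring.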