cudf.core.tokenize_vocabulary.TokenizeVocabulary

class cudf.core.tokenize_vocabulary.TokenizeVocabulary(vocabulary: Series)

A vocabulary object used to tokenize input text.

Parameters:
vocabulary : Series

Strings column of vocabulary terms

Methods

tokenize(text[, delimiter, default_id])
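Examples

The sketch below is a CPU-only, pure-Python illustration of what vocabulary tokenization is generally understood to do. It is not cuDF's implementation and does not call cuDF. It assumes that each delimiter-separated token is mapped to its row position in the vocabulary, and that tokens missing from the vocabulary receive default_id. The helper name tokenize_with_vocabulary and the defaults delimiter=None (split on whitespace) and default_id=-1 are hypothetical choices made for this illustration.

```python
def tokenize_with_vocabulary(texts, vocabulary, delimiter=None, default_id=-1):
    """Hypothetical helper: map each token to its position in `vocabulary`.

    Assumption: a token id is the row position of the term in the
    vocabulary, and unknown tokens get `default_id`.
    """
    # Build a lookup from vocabulary term to its row position.
    token_ids = {term: position for position, term in enumerate(vocabulary)}

    result = []
    for text in texts:
        # delimiter=None splits on runs of whitespace.
        tokens = text.split(delimiter)
        result.append([token_ids.get(token, default_id) for token in tokens])
    return result


vocabulary = ["the", "quick", "fox"]
texts = ["the quick brown fox", "fox hunt"]

ids = tokenize_with_vocabulary(texts, vocabulary)
print(ids)  # [[0, 1, -1, 2], [2, -1]]
```

With cuDF itself, the corresponding steps would be to construct TokenizeVocabulary from a strings Series and pass a strings Series to tokenize. The exact defaults and return type should be checked against the tokenize method's own documentation.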