cudf.core.column.string.StringMethods.tokenize
- StringMethods.tokenize(delimiter: str = ' ') → SeriesOrIndex
Each string is split into tokens using the provided delimiter(s). The sequence returned contains the tokens in the order they were found.
- Parameters:
- delimiter : str or list of str, default is whitespace.
The string(s) used to locate the split points of each string.
- Returns:
- Series or Index of object.
Examples
>>> import cudf
>>> data = ["hello world", "goodbye world", "hello goodbye"]
>>> ser = cudf.Series(data)
>>> ser.str.tokenize()
0      hello
0      world
1    goodbye
1      world
2      hello
2    goodbye
dtype: object
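A minimal sketch of the list-of-delimiters form described above; the example data and exact output spacing are illustrative and not taken from the original docstring:

>>> import cudf
>>> ser = cudf.Series(["a,b;c", "d,e"])
>>> ser.str.tokenize(delimiter=[",", ";"])
0    a
0    b
0    c
1    d
1    e
dtype: object

Note that the result index repeats each input row's index once per token, which makes it straightforward to map tokens back to their source strings.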