cudf.core.column.string.StringMethods.character_tokenize

StringMethods.character_tokenize() → SeriesOrIndex

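Split each string into its individual characters. The result holds one single-character string per character of the input, in order. As the example below shows, a null entry produces no output rows, and each output row keeps the index label of the string it came from.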

Returns:
Series or Index of object dtype.

Examples

>>> import cudf
>>> data = ["hello world", None, "goodbye, thank you."]
>>> ser = cudf.Series(data)
>>> ser.str.character_tokenize()
0    h
0    e
0    l
0    l
0    o
0
0    w
0    o
0    r
0    l
0    d
2    g
2    o
2    o
2    d
2    b
2    y
2    e
2    ,
2
2    t
2    h
2    a
2    n
2    k
2
2    y
2    o
2    u
2    .
dtype: object
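
The row layout above can be reproduced without a GPU. The following is a minimal pure-Python sketch of the behaviour shown in the example, not cuDF's implementation: the helper name `character_tokenize_sketch` is hypothetical, it returns plain `(index, character)` tuples rather than a Series, and it assumes that one Python character (code point) corresponds to one output row.

```python
def character_tokenize_sketch(values):
    """Hypothetical helper: mimic the row layout of the example above.

    Returns a list of (index_label, character) pairs, one per character.
    """
    pairs = []
    for idx, value in enumerate(values):
        if value is None:
            # A null entry contributes no rows, so its index label
            # never appears in the result.
            continue
        for ch in value:
            # Every character keeps the index label of its source string.
            pairs.append((idx, ch))
    return pairs


data = ["hello world", None, "goodbye, thank you."]
tokens = character_tokenize_sketch(data)

for idx, ch in tokens:
    print(idx, ch)
```

Spaces and punctuation are emitted as rows of their own, which is why some rows in the cuDF output above appear to have an empty value.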