cudf.core.accessors.string.StringMethods.character_tokenize

StringMethods.character_tokenize() → Series | Index

Split each string into its individual characters. The returned sequence contains every character as a separate single-character string, including spaces and punctuation.

Returns:
Series or Index of object.

Examples

>>> import cudf
>>> data = ["hello world", "goodbye, thank you."]
>>> ser = cudf.Series(data)
>>> ser.str.character_tokenize()
0    h
0    e
0    l
0    l
0    o
0
0    w
0    o
0    r
0    l
0    d
1    g
1    o
1    o
1    d
1    b
1    y
1    e
1    ,
1
1    t
1    h
1    a
1    n
1    k
1
1    y
1    o
1    u
1    .
dtype: object
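
The rows that look empty in the output above hold the space character. To make the per-character split easier to follow without a GPU, here is a minimal pure-Python sketch of the same idea. The helper name `character_tokenize_sketch` is hypothetical and is not part of cuDF; it only mirrors the behaviour shown in the example (one output entry per character, labelled with the position of the source string) and says nothing about how cuDF implements the operation.

```python
def character_tokenize_sketch(strings):
    """Return (row_label, character) pairs, one pair per character.

    Hypothetical helper for illustration only; not a cuDF API.
    """
    return [
        (row_label, char)
        for row_label, text in enumerate(strings)
        for char in text  # iterating a str yields one character at a time
    ]


data = ["hello world", "goodbye, thank you."]
pairs = character_tokenize_sketch(data)

# Print in a layout loosely resembling the Series output above;
# repr() makes the space characters visible.
for row_label, char in pairs:
    print(row_label, repr(char))
```

Using `repr()` when printing is a deliberate choice here: it shows `' '` explicitly, so the space entries are not mistaken for missing values.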