cudf.core.column.string.StringMethods.rsplit
- StringMethods.rsplit(pat: str | None = None, n: int = -1, expand: bool = False, regex: bool | None = None) -> SeriesOrIndex
Split strings around given separator/delimiter.
Splits the string in the Series/Index from the end, at the specified delimiter string. Similar to str.rsplit().
- Parameters:
- pat : str, default ‘ ‘ (space)
String to split on, does not yet support regular expressions.
- n : int, default -1 (all)
Limit number of splits in output. None, 0, and -1 will all be interpreted as “all splits”.
- expand : bool, default False
Expand the split strings into separate columns.
If True, return DataFrame/MultiIndex expanding dimensionality.
If False, return Series/Index, containing lists of strings.
- regex : bool, default None
Determines if the passed-in pattern is a regular expression:
If True, assumes the passed-in pattern is a regular expression.
If False, treats the pattern as a literal string.
If pat length is 1, treats pat as a literal string.
- Returns:
- Series, Index, DataFrame or MultiIndex
Type matches caller unless expand=True (see Notes).
See also
split
Split strings around given separator/delimiter.
str.split
Standard library version for split.
str.rsplit
Standard library version for rsplit.
Notes
The handling of the n keyword depends on the number of found splits:
If found splits > n, make first n splits only
If found splits <= n, make all splits
If for a certain row the number of found splits < n, append None for padding up to n if expand=True.
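The rules for n above can be illustrated without a GPU using the standard-library str.rsplit, which this method is described as resembling. This is only a sketch of one row with expand=False: emulate_rsplit_row is a hypothetical helper, not part of cudf, and it assumes the n handling stated under Parameters (None, 0, and -1 all mean "all splits").

```python
def emulate_rsplit_row(value, pat=None, n=-1):
    """Hypothetical pure-Python stand-in for one row of rsplit (expand=False)."""
    if value is None:
        return None  # a missing value stays missing
    # Assumption taken from the parameter description: None, 0 and -1 all
    # mean "all splits". str.rsplit treats maxsplit=0 as "no splits", so the
    # value is normalised before the call.
    maxsplit = -1 if n in (None, 0, -1) else n
    return value.rsplit(pat, maxsplit)


sentence = "this is a regular sentence"

# Found splits (4) > n (2): only two splits are made, counted from the end.
limited = emulate_rsplit_row(sentence, n=2)

# Found splits (4) <= n (10): every split is made.
unlimited = emulate_rsplit_row(sentence, n=10)

# n=0 is treated as "all splits" under the assumption above.
zero = emulate_rsplit_row(sentence, n=0)
```

Here limited is ['this is a', 'regular', 'sentence'], while unlimited and zero both contain all five words.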
If using expand=True, Series and Index callers return DataFrame and MultiIndex objects, respectively.
Examples
>>> import cudf
>>> s = cudf.Series(
...     [
...         "this is a regular sentence",
...         "https://docs.python.org/3/tutorial/index.html",
...         None
...     ]
... )
>>> s
0                       this is a regular sentence
1    https://docs.python.org/3/tutorial/index.html
2                                             <NA>
dtype: object
In the default setting, the string is split by whitespace.
>>> s.str.rsplit()
0                   [this, is, a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                               None
dtype: list
Without the n parameter, the outputs of rsplit and split are identical.
>>> s.str.split()
0                   [this, is, a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                               None
dtype: list
The n parameter can be used to limit the number of splits on the delimiter. The outputs of split and rsplit are different.
>>> s.str.rsplit(n=2)
0                     [this is a, regular, sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                               None
dtype: list
>>> s.str.split(n=2)
0                     [this, is, a regular sentence]
1    [https://docs.python.org/3/tutorial/index.html]
2                                               None
dtype: list
When using expand=True, the split elements will expand out into separate columns. If a <NA> value is present, it is propagated throughout the columns during the split.
>>> s.str.rsplit(n=2, expand=True)
                                               0        1         2
0                                      this is a  regular  sentence
1  https://docs.python.org/3/tutorial/index.html     <NA>      <NA>
2                                           <NA>     <NA>      <NA>
For slightly more complex use cases, such as splitting the HTML document name from a URL, a combination of parameter settings can be used.
>>> s.str.rsplit("/", n=1, expand=True)
                                    0           1
0          this is a regular sentence        <NA>
1  https://docs.python.org/3/tutorial  index.html
2                                <NA>        <NA>
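The padding seen with expand=True can be mimicked row by row in plain Python, which may help when reasoning about the shape of the result. The sketch below is not the cudf implementation: rsplit_expand is a hypothetical helper that splits each row with the standard-library str.rsplit, pads shorter rows on the right with None, and turns a missing row into a row of None values, matching the output shown above.

```python
def rsplit_expand(rows, pat=None, n=-1):
    """Hypothetical pure-Python sketch of rsplit(..., expand=True)."""
    # Assumption from the parameter description: None, 0 and -1 mean "all splits".
    maxsplit = -1 if n in (None, 0, -1) else n
    # Split every non-missing row from the right; keep None for missing rows.
    split_rows = [None if r is None else r.rsplit(pat, maxsplit) for r in rows]
    # The table is as wide as the longest split result.
    width = max((len(p) for p in split_rows if p is not None), default=0)
    table = []
    for parts in split_rows:
        if parts is None:
            table.append([None] * width)  # missing value fills the whole row
        else:
            table.append(parts + [None] * (width - len(parts)))  # pad on the right
    return table


rows = [
    "this is a regular sentence",
    "https://docs.python.org/3/tutorial/index.html",
    None,
]
table = rsplit_expand(rows, "/", n=1)
```

With these inputs, table holds the sentence followed by None, then the URL split into its directory part and index.html, then a row of two None values.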