cudf.Series.factorize

Series.factorize(na_sentinel=-1)

Encode the input values as integer labels.

Parameters
na_sentinel : number

Value written to labels to mark a missing category. Defaults to -1.

Returns
(labels, cats) : (cupy.ndarray, cupy.ndarray or Index)
  • labels contains the encoded values

  • cats contains the categories, ordered so that the N-th item (counting from 1) corresponds to code N-1; that is, code k refers to cats[k].

Examples

>>> import cudf
>>> s = cudf.Series(['a', 'a', 'c'])
>>> codes, uniques = s.factorize()
>>> codes
array([0, 0, 1], dtype=int8)
>>> uniques
StringIndex(['a' 'c'], dtype='object')
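
The relationship between labels, cats, and na_sentinel can be illustrated without a GPU. The following is a minimal pure-Python sketch, not cuDF's implementation: factorize_sketch is a hypothetical helper, None stands in for a missing value, and categories are assumed to be kept in first-seen order, which cuDF does not necessarily guarantee.

```python
def factorize_sketch(values, na_sentinel=-1):
    """Hypothetical illustration of integer label encoding (not cuDF code)."""
    cats = []        # distinct non-missing values, first-seen order (assumption)
    code_of = {}     # maps each category to its integer code
    labels = []
    for value in values:
        if value is None:
            # Missing entries receive the sentinel instead of a category code.
            labels.append(na_sentinel)
            continue
        if value not in code_of:
            code_of[value] = len(cats)
            cats.append(value)
        labels.append(code_of[value])
    return labels, cats


labels, cats = factorize_sketch(['a', None, 'a', 'c'])

# Each non-sentinel code k indexes cats[k], so the input can be rebuilt.
restored = [cats[code] if code != -1 else None for code in labels]
```

Here labels is [0, -1, 0, 1] and cats is ['a', 'c']; mapping the codes back through cats recovers the original values, with the sentinel marking the missing entry.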