cudf.Index.factorize
- Index.factorize(sort: bool = False, use_na_sentinel: bool = True) -> tuple[cupy.ndarray, cudf.Index]
Encode the input values as integer labels.
- Parameters:
- sort: bool, default False
Sort uniques and shuffle codes to maintain the relationship.
- use_na_sentinel: bool, default True
If True, NA values are encoded with the sentinel -1 and are dropped from the uniques. If False, NA values are encoded as non-negative integers and NA is kept in the uniques.
- Returns:
- (labels, cats): (cupy.ndarray, cupy.ndarray or Index)
labels contains the encoded values.
cats contains the categories, ordered so that the N-th item corresponds to code N-1.
Examples
>>> import cudf
>>> s = cudf.Series(['a', 'a', 'c'])
>>> codes, uniques = s.factorize()
>>> codes
array([0, 0, 1], dtype=int8)
>>> uniques
Index(['a', 'c'], dtype='object')
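The interaction of sort and use_na_sentinel can be illustrated with a small pure-Python model. This is only a sketch of the documented semantics, not cuDF's implementation: the function name factorize_sketch is hypothetical, None stands in for NA, plain lists stand in for cupy.ndarray and Index, and placing NA last when sorting is an assumption of this sketch.

```python
def factorize_sketch(values, sort=False, use_na_sentinel=True):
    """Hypothetical pure-Python model of factorize; None stands in for NA."""
    uniques = []
    for v in values:
        if v is None and use_na_sentinel:
            continue  # NA is left out of uniques and coded as -1 below
        if v not in uniques:
            uniques.append(v)  # keep order of first appearance

    if sort:
        # Sort the non-NA uniques; this sketch assumes NA (if kept) goes last.
        non_na = sorted(u for u in uniques if u is not None)
        uniques = non_na + ([None] if None in uniques else [])

    # Code N refers to uniques[N]; values missing from uniques get -1.
    position = {u: i for i, u in enumerate(uniques)}
    codes = [position.get(v, -1) for v in values]
    return codes, uniques


data = ["c", "a", None, "c"]
print(factorize_sketch(data))                         # ([0, 1, -1, 0], ['c', 'a'])
print(factorize_sketch(data, sort=True))              # ([1, 0, -1, 1], ['a', 'c'])
print(factorize_sketch(data, use_na_sentinel=False))  # ([0, 1, 2, 0], ['c', 'a', None])
```

In each case, indexing the uniques with a non-negative code recovers the original value, which is the relationship that sort=True preserves when it reorders the uniques.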