cudf.DataFrame.partition_by_hash

DataFrame.partition_by_hash(columns: Sequence[Hashable], nparts: int, keep_index: bool = True) → list[Self]

Partition the dataframe by the hashed value of data in columns.

Parameters:

columns (sequence of str): The names of the columns to be hashed. Must contain at least one name.
nparts (int): Number of output partitions.
keep_index (bool, default True): Whether to keep the index or drop it.

Returns:

partitioned: list of DataFrame
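
Examples:

The idea behind hash partitioning can be sketched in plain Python. This is a conceptual illustration only, not cuDF's implementation: the helper `partition_rows_by_hash` is hypothetical, rows are modeled as dicts, and SHA-256 stands in for whatever hash function the library actually uses. It shows the property that matters in practice: rows whose values in `columns` are equal always land in the same partition.

```python
import hashlib


def partition_rows_by_hash(rows, columns, nparts):
    """Hypothetical helper: split rows (dicts) into nparts lists by hashing
    the values found in the given columns. Not part of cuDF."""
    if not columns:
        raise ValueError("columns must contain at least one name")
    partitions = [[] for _ in range(nparts)]
    for row in rows:
        # Build a deterministic byte key from the selected column values.
        key = repr(tuple(row[c] for c in columns)).encode("utf-8")
        # SHA-256 is an assumption made for this sketch; it is stable across runs.
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:8], "big") % nparts
        partitions[index].append(row)
    return partitions


rows = [
    {"key": "a", "value": 1},
    {"key": "b", "value": 2},
    {"key": "a", "value": 3},
    {"key": "c", "value": 4},
    {"key": "b", "value": 5},
]

parts = partition_rows_by_hash(rows, ["key"], nparts=3)

# Every row ends up in exactly one of the nparts output lists,
# and some of those lists may be empty.
for i, part in enumerate(parts):
    print(i, part)
```

With cuDF itself, the corresponding call on a DataFrame `df` that has a `key` column would be `df.partition_by_hash(["key"], nparts=3)`, which returns a list of DataFrames as described above.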