cudf.DataFrame.partition_by_hash

DataFrame.partition_by_hash(columns, nparts, keep_index=True)

Partition the dataframe by the hashed value of data in columns.

Parameters:
columns: sequence of str

The names of the columns to be hashed. Must have at least one name.

nparts: int

Number of output partitions.

keep_index: boolean, default True

Whether to keep the index in the output partitions (True) or drop it (False).

Returns:
partitioned: list of DataFrame
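
The idea behind hash partitioning can be illustrated with a minimal, CPU-only sketch. This is not cuDF's implementation: the helper name partition_by_hash_sketch, the list-of-dicts row format, and the use of zlib.crc32 as the hash are all assumptions made for illustration, and the actual hash function and partition assignment in cuDF may differ. What it shows is the general contract: each row is assigned to one of nparts buckets based on a hash of the values in the chosen columns, so rows with equal key values always land in the same partition.

```python
import zlib


def partition_by_hash_sketch(rows, columns, nparts):
    """Hypothetical helper: split `rows` (a list of dicts) into `nparts`
    lists by hashing the values found in `columns`."""
    if not columns:
        raise ValueError("columns must contain at least one name")

    partitions = [[] for _ in range(nparts)]
    for row in rows:
        # Build a deterministic byte key from the selected column values.
        key = repr(tuple(row[c] for c in columns)).encode("utf-8")
        # crc32 is an illustrative stand-in, not the hash cuDF uses.
        partitions[zlib.crc32(key) % nparts].append(row)
    return partitions


rows = [{"key": k, "val": i} for i, k in enumerate(["a", "b", "a", "c", "b", "a"])]
parts = partition_by_hash_sketch(rows, ["key"], nparts=3)

for i, part in enumerate(parts):
    print(i, part)
```

With a cuDF DataFrame df that has a "key" column, the corresponding call would be df.partition_by_hash(["key"], nparts=3), which returns a list of DataFrames.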