partitioning

pylibcudf.partitioning.hash_partition(Table input, list columns_to_hash, int num_partitions) -> tuple

Partitions rows from the input table into multiple output tables.

For details, see the libcudf C++ function hash_partition().

Parameters:
input : Table

The table to partition

columns_to_hash : list[int]

Indices of input columns to hash

num_partitions : int

The number of partitions to use

Returns:
tuple[Table, list[int]]

An output table and a list of row offsets to each partition
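
The output layout can be modeled in plain Python. The sketch below is not pylibcudf code: model_hash_partition is a hypothetical helper, rows are plain tuples, and zlib.crc32 stands in for whatever hash function the library actually uses. It also assumes one start offset per partition, which may differ from the library's convention.

```python
import zlib


def model_hash_partition(rows, columns_to_hash, num_partitions):
    """Plain-Python model of hash partitioning (hypothetical, not pylibcudf)."""
    buckets = [[] for _ in range(num_partitions)]
    for row in rows:
        # Hash only the selected key columns; crc32 is a stand-in hash.
        key = repr(tuple(row[i] for i in columns_to_hash)).encode()
        buckets[zlib.crc32(key) % num_partitions].append(row)

    # Concatenate the partitions and record where each one starts.
    out, offsets = [], []
    for bucket in buckets:
        offsets.append(len(out))
        out.extend(bucket)
    return out, offsets


rows = [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (2, "e")]
table, offsets = model_hash_partition(rows, [0], 3)

# Slice the single output table back into per-partition row groups.
parts = [table[s:e] for s, e in zip(offsets, offsets[1:] + [len(table)])]
```

Rows with equal values in the hashed columns always land in the same partition, which is the property that makes this scheme useful for joins and group-bys.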

pylibcudf.partitioning.partition(Table t, Column partition_map, int num_partitions) -> tuple

Partitions rows of t according to the mapping specified by partition_map.

For details, see the libcudf C++ function partition().

Parameters:
t : Table

The table to partition

partition_map : Column

Non-nullable column of integer values that map each row in t to its partition.

num_partitions : int

The total number of partitions

Returns:
tuple[Table, list[int]]

An output table and a list of row offsets to each partition
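
A plain-Python model of this mapping may help when reading the return value. model_partition is a hypothetical helper, not part of pylibcudf. It keeps input order within each partition and emits one start offset per partition. Both choices are assumptions of the sketch, not guarantees of the library.

```python
def model_partition(rows, partition_map, num_partitions):
    """Plain-Python model of map-based partitioning (hypothetical, not pylibcudf)."""
    if len(rows) != len(partition_map):
        raise ValueError("partition_map needs exactly one entry per row")

    buckets = [[] for _ in range(num_partitions)]
    for row, part in zip(rows, partition_map):
        if not 0 <= part < num_partitions:
            raise ValueError(f"partition index {part} is out of range")
        buckets[part].append(row)

    # Rows of the same partition become contiguous in the output.
    out, offsets = [], []
    for bucket in buckets:
        offsets.append(len(out))
        out.extend(bucket)
    return out, offsets


rows = ["r0", "r1", "r2", "r3", "r4"]
table, offsets = model_partition(rows, [2, 0, 2, 1, 0], 3)
print(table)    # ['r1', 'r4', 'r3', 'r0', 'r2']
print(offsets)  # [0, 2, 3]
```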

pylibcudf.partitioning.round_robin_partition(Table input, int num_partitions, int start_partition=0) -> tuple

Partitions rows of the input table in round-robin order, starting with start_partition.

For details, see the libcudf C++ function round_robin_partition().

Parameters:
input : Table

The input table to be round-robin partitioned

num_partitions : int

Number of partitions for the table

start_partition : int, default 0

Index of the first partition

Returns:
tuple[Table, list[int]]

The partitioned table and the partition offsets for each partition within the table.
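
The effect of start_partition is easiest to see on a small input. The following is a plain-Python sketch, not pylibcudf code: model_round_robin_partition is a hypothetical helper, and it assumes that row i is assigned to partition (start_partition + i) % num_partitions, with one start offset reported per partition.

```python
def model_round_robin_partition(rows, num_partitions, start_partition=0):
    """Plain-Python model of round-robin partitioning (hypothetical, not pylibcudf)."""
    buckets = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        # Assumed rule: deal rows out in turn, beginning at start_partition.
        buckets[(start_partition + i) % num_partitions].append(row)

    # Concatenate the partitions and record where each one starts.
    out, offsets = [], []
    for bucket in buckets:
        offsets.append(len(out))
        out.extend(bucket)
    return out, offsets


letters = list("abcdefgh")

first, first_offsets = model_round_robin_partition(letters, 3)
# first         == ['a', 'd', 'g', 'b', 'e', 'h', 'c', 'f']
# first_offsets == [0, 3, 6]

shifted, shifted_offsets = model_round_robin_partition(letters, 3, start_partition=1)
# shifted         == ['c', 'f', 'a', 'd', 'g', 'b', 'e', 'h']
# shifted_offsets == [0, 2, 5]
```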