partitioning
- pylibcudf.partitioning.hash_partition(Table input, list columns_to_hash, int num_partitions) -> tuple
Partitions rows from the input table into multiple output tables.
For details, see hash_partition().
- Parameters:
- input : Table
The table to partition
- columns_to_hash : list[int]
Indices of input columns to hash
- num_partitions : int
The number of partitions to use
- Returns:
- tuple[Table, list[int]]
An output table and a list of row offsets to each partition
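To make the return value concrete, here is a minimal pure-Python sketch of the idea. It does not call pylibcudf: `hash_partition_model` is a hypothetical name, rows are plain tuples, and Python's built-in `hash` stands in for whatever hash function the library uses, so the partition a given row lands in will not match the library's output. The sketch also assumes one start offset per partition.

```python
def hash_partition_model(rows, columns_to_hash, num_partitions):
    """Toy model: group rows by a hash of the selected columns.

    Returns the reordered rows and the start offset of each partition.
    """
    buckets = [[] for _ in range(num_partitions)]
    for row in rows:
        # Only the selected columns contribute to the hash key.
        key = tuple(row[i] for i in columns_to_hash)
        # Stand-in hash (assumption); equal keys always share a partition.
        buckets[hash(key) % num_partitions].append(row)

    out, offsets = [], []
    for bucket in buckets:
        offsets.append(len(out))  # where this partition starts in `out`
        out.extend(bucket)
    return out, offsets


rows = [(1, "a"), (2, "b"), (1, "c"), (3, "d"), (2, "e")]
out, offsets = hash_partition_model(rows, [0], 2)

# Partition i is the slice between consecutive offsets.
bounds = offsets + [len(out)]
parts = [out[bounds[i]:bounds[i + 1]] for i in range(2)]
```

The property worth noticing is that rows with equal values in the hashed columns always end up in the same partition, which is what makes this scheme useful for distributed joins and group-bys.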
- pylibcudf.partitioning.partition(Table t, Column partition_map, int num_partitions) -> tuple
Partitions rows of t according to the mapping specified by partition_map.
For details, see partition().
- Parameters:
- t : Table
The table to partition
- partition_map : Column
Non-nullable column of integer values that map each row in t to its partition.
- num_partitions : int
The total number of partitions
- Returns:
- tuple[Table, list[int]]
An output table and a list of row offsets to each partition
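The mapping-based variant can be pictured as a counting sort on the partition map. The sketch below is a pure-Python model under stated assumptions, not the library's implementation: `partition_model` is a hypothetical name, rows are plain Python values, and the sketch chooses to append a final end-of-table entry so that partition `i` is `out[offsets[i]:offsets[i + 1]]`. Check the length of the list the library actually returns before relying on that trailing entry.

```python
def partition_model(rows, partition_map, num_partitions):
    """Toy model: reorder rows so each partition is contiguous."""
    if len(rows) != len(partition_map):
        raise ValueError("partition_map needs one entry per row")

    # 1. Count how many rows belong to each partition.
    counts = [0] * num_partitions
    for p in partition_map:
        if not 0 <= p < num_partitions:
            raise ValueError(f"partition id {p} out of range")
        counts[p] += 1

    # 2. Running sum of the counts gives each partition's start,
    #    plus a final entry equal to the total row count.
    offsets = [0]
    for c in counts:
        offsets.append(offsets[-1] + c)

    # 3. Scatter each row to the next free slot of its partition.
    cursor = offsets[:-1]  # slicing copies, so `offsets` is not mutated
    out = [None] * len(rows)
    for row, p in zip(rows, partition_map):
        out[cursor[p]] = row
        cursor[p] += 1
    return out, offsets


values = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
pmap = [3, 4, 3, 1, 4, 4, 0, 1, 1, 1]
out, offsets = partition_model(values, pmap, 5)
# out     == [22, 16, 24, 26, 28, 10, 14, 12, 18, 20]
# offsets == [0, 1, 5, 5, 7, 10]   (partition 2 is empty)
```

This model keeps rows within a partition in their input order; whether the library guarantees any particular order inside a partition is not stated here, so treat that as a property of the sketch only.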
- pylibcudf.partitioning.round_robin_partition(Table input, int num_partitions, int start_partition=0) -> tuple
Partitions rows from the input table in round-robin fashion.
For details, see round_robin_partition().
- Parameters:
- input : Table
The input table to be round-robin partitioned
- num_partitions : int
Number of partitions for the table
- start_partition : int, default 0
Index of the first partition
- Returns:
- tuple[Table, list[int]]
The partitioned table and the partition offsets for each partition within the table.
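As a rough illustration of round-robin assignment, the pure-Python sketch below deals rows out to partitions in turn. It does not call pylibcudf. `round_robin_model` is a hypothetical name, and the sketch assumes that row `i` goes to partition `(start_partition + i) % num_partitions` and that one start offset is returned per partition.

```python
def round_robin_model(rows, num_partitions, start_partition=0):
    """Toy model: deal rows to partitions in turn, then concatenate."""
    buckets = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        # Assumption: the first row goes to `start_partition`,
        # the next row to the following partition, wrapping around.
        buckets[(start_partition + i) % num_partitions].append(row)

    out, offsets = [], []
    for bucket in buckets:
        offsets.append(len(out))  # where this partition starts in `out`
        out.extend(bucket)
    return out, offsets


out, offsets = round_robin_model(list(range(13)), 3)
# out     == [0, 3, 6, 9, 12, 1, 4, 7, 10, 2, 5, 8, 11]
# offsets == [0, 5, 9]

shifted, shifted_offsets = round_robin_model(list(range(5)), 3, start_partition=1)
# shifted         == [2, 0, 3, 1, 4]
# shifted_offsets == [0, 1, 3]
```

Round-robin ignores row contents, so it balances partition sizes to within one row but gives no guarantee that related rows land together.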