reduce
- pylibcudf.reduce.distinct_count(Column source, null_policy null_handling, nan_policy nan_handling, Stream stream=None) -> size_type
Returns the number of distinct elements in the input column.
For details, see cudf::distinct_count().
- Parameters:
- source : Column
The input column to count the unique elements of.
- null_handling : null_policy
Flag to include or exclude nulls from the count.
- nan_handling : nan_policy
Flag to include or exclude NaNs from the count.
- Returns:
- size_type
The number of distinct elements in the input column.
- pylibcudf.reduce.is_valid_reduce_aggregation(DataType source, Aggregation agg) -> bool
Return whether an aggregation is supported for a given data type.
- Parameters:
- source : DataType
The type of the column the aggregation is being performed on.
- agg : Aggregation
The aggregation.
- Returns:
- bool
True if the aggregation is supported, False otherwise.
- pylibcudf.reduce.minmax(Column col, Stream stream=None, DeviceMemoryResource mr=None) -> tuple
Compute the minimum and maximum of a column.
For details, see the cudf::minmax documentation.
- Parameters:
- col : Column
The column to compute the minimum and maximum of.
- stream : Stream | None
CUDA stream on which to perform the operation.
- mr : DeviceMemoryResource | None
Device memory resource used to allocate the returned scalars’ device memory.
- Returns:
- tuple
A tuple of two Scalars, the first being the minimum and the second being the maximum.
- pylibcudf.reduce.reduce(Column col, Aggregation agg, DataType data_type, Scalar init=None, Stream stream=None, DeviceMemoryResource mr=None) -> Scalar
Perform a reduction on a column.
For details, see the cudf::reduce documentation.
- Parameters:
- col : Column
The column to perform the reduction on.
- agg : Aggregation
The aggregation to perform.
- data_type : DataType
The data type of the result.
- init : Scalar | None
The initial value for the reduction.
- stream : Stream | None
CUDA stream on which to perform the operation.
- mr : DeviceMemoryResource | None
Device memory resource used to allocate the returned scalar’s device memory.
- Returns:
- Scalar
The result of the reduction.
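To make the role of `init` concrete, the following is a plain-Python model of a reduction, not a call into pylibcudf. It assumes that an initial value, when given, is simply folded in together with the column elements; the helper name `reduce_model` is hypothetical, and null handling and result-type casting are ignored here.

```python
from functools import reduce as fold
import operator


def reduce_model(values, op, init=None):
    """Fold `values` with the binary operator `op` (hypothetical model).

    Assumption: when `init` is given, it takes part in the fold as if it
    were one more input value.
    """
    if init is None:
        return fold(op, values)
    return fold(op, values, init)


data = [1, 2, 3, 4]
print(reduce_model(data, operator.add))            # 10
print(reduce_model(data, operator.add, init=100))  # 110
print(reduce_model(data, max))                     # 4
```

In this model a sum with `init=100` behaves like summing the column and then adding 100, which is the usual meaning of an initial value in a fold.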
- pylibcudf.reduce.scan(Column col, Aggregation agg, scan_type inclusive, Stream stream=None, DeviceMemoryResource mr=None) -> Column
Perform a scan on a column.
For details, see the cudf::scan documentation.
- Parameters:
- col : Column
The column to perform the scan on.
- agg : Aggregation
The aggregation to perform.
- inclusive : scan_type
The type of scan to perform.
- stream : Stream | None
CUDA stream on which to perform the operation.
- mr : DeviceMemoryResource | None
Device memory resource used to allocate the returned column’s device memory.
- Returns:
- Column
The result of the scan.
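As a rough illustration of what "the type of scan" can mean, the sketch below models inclusive and exclusive prefix scans in plain Python. It assumes, as the parameter name `inclusive` suggests, that `scan_type` selects between these two variants; the helper names are hypothetical, the identity value is supplied by the caller, and null handling is not modeled.

```python
from itertools import accumulate
import operator


def inclusive_scan(values, op=operator.add):
    """Element i of the result combines values[0..i], including values[i]."""
    return list(accumulate(values, op))


def exclusive_scan(values, op=operator.add, identity=0):
    """Element i of the result combines values[0..i-1], excluding values[i].

    The first output is the identity of `op` (0 for addition).
    """
    if not values:
        return []
    return list(accumulate(values[:-1], op, initial=identity))


data = [1, 2, 3, 4]
print(inclusive_scan(data))  # [1, 3, 6, 10]
print(exclusive_scan(data))  # [0, 1, 3, 6]
```

Both variants return one output per input element, which matches the function returning a Column rather than a Scalar.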
- pylibcudf.reduce.unique_count(Column source, null_policy null_handling, nan_policy nan_handling, Stream stream=None) -> size_type
Returns the number of unique consecutive elements in the input column.
For details, see cudf::unique_count().
- Parameters:
- source : Column
The input column to count the unique elements of.
- null_handling : null_policy
Flag to include or exclude nulls from the count.
- nan_handling : nan_policy
Flag to include or exclude NaNs from the count.
- Returns:
- size_type
The number of unique consecutive elements in the input column.
Notes
If the input column is sorted, then unique_count can produce the same result as distinct_count, but faster.
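The difference between the two counts can be sketched in plain Python. This is only a model of the semantics described above, not pylibcudf code: the helper names are hypothetical, and the null and NaN policies are left out.

```python
from itertools import groupby


def distinct_count_model(values):
    """Count distinct values, wherever they occur in the sequence."""
    return len(set(values))


def unique_count_model(values):
    """Count runs of equal adjacent values; only consecutive duplicates collapse."""
    return sum(1 for _ in groupby(values))


data = [3, 1, 1, 3, 2, 2]
print(distinct_count_model(data))        # 3 (the values 1, 2, 3)
print(unique_count_model(data))          # 4 (runs: [3], [1, 1], [3], [2, 2])
print(unique_count_model(sorted(data)))  # 3 (equal values are adjacent once sorted)
```

On unsorted input the consecutive count can exceed the distinct count, because a value that reappears after a different value starts a new run. Sorting makes equal values adjacent, so the two counts agree, and the consecutive count needs only a single pass over neighbouring elements.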