aggregation

class pylibcudf.aggregation.Aggregation

A type of aggregation to perform.

Aggregations are passed to APIs like aggregate() to indicate what operations to perform. Using a class for aggregations provides a unified API for handling parametrizable aggregations. Never instantiate this class directly; create instances only through one of the factory functions.

For details, see cudf::aggregation.
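As a minimal sketch of the factory pattern described above, the helper below builds a few aggregations through factory functions documented on this page and reads back each one's kind. The helper name `build_example_aggregations` is hypothetical, and the import is deferred into the function body on the assumption that pylibcudf is only available in a CUDA-capable environment.

```python
def build_example_aggregations():
    """Create aggregations via factory functions rather than Aggregation()."""
    # Deferred import: pylibcudf is only required when the helper is called.
    import pylibcudf as plc

    aggs = [
        plc.aggregation.sum(),                   # takes no parameters
        plc.aggregation.std(ddof=0),             # parametrized by ddof
        plc.aggregation.quantile([0.25, 0.75]),  # parametrized by a list
    ]
    # kind() reports which aggregation each object represents.
    return [agg.kind() for agg in aggs]
```

The same Aggregation type is returned whether or not the factory takes parameters, which is what lets APIs such as aggregate() accept them uniformly.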

Methods

kind(self)

Get the kind of the aggregation.


pylibcudf.aggregation.UdfType

See also cudf::udf_type.

Enum members

CUDA
PTX

pylibcudf.aggregation.all() → Aggregation

Create an all aggregation.

For details, see make_all_aggregation().

Returns:

Aggregation: The all aggregation.

pylibcudf.aggregation.any() → Aggregation

Create an any aggregation.

For details, see make_any_aggregation().

Returns:

Aggregation: The any aggregation.

pylibcudf.aggregation.argmax() → Aggregation

Create an argmax aggregation.

For details, see make_argmax_aggregation().

Returns:

Aggregation: The argmax aggregation.

pylibcudf.aggregation.argmin() → Aggregation

Create an argmin aggregation.

For details, see make_argmin_aggregation().

Returns:

Aggregation: The argmin aggregation.

pylibcudf.aggregation.collect_list(null_policy null_handling=null_policy.INCLUDE) → Aggregation

Create a collect_list aggregation.

For details, see make_collect_list_aggregation().

Parameters:

null_handling (null_policy, default INCLUDE): Whether or not nulls should be included.

Returns:

Aggregation: The collect_list aggregation.

pylibcudf.aggregation.collect_set(null_handling=null_policy.INCLUDE, nulls_equal=null_equality.EQUAL, nans_equal=nan_equality.ALL_EQUAL) → Aggregation

Create a collect_set aggregation.

For details, see make_collect_set_aggregation().

Parameters:

null_handling (null_policy, default INCLUDE): Whether or not nulls should be included.
nulls_equal (null_equality, default EQUAL): Whether or not nulls should be considered equal.
nans_equal (nan_equality, default ALL_EQUAL): Whether or not NaNs should be considered equal.

Returns:

Aggregation: The collect_set aggregation.
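The interaction of the three parameters can be illustrated with a pure-Python model. This is only a sketch of the semantics, not how pylibcudf computes the result: `collect_set_sketch` and its boolean flags are hypothetical stand-ins for the enum parameters, `None` stands in for a null, and the first-seen output order is an assumption made for readability.

```python
import math


def collect_set_sketch(values, include_nulls=True, nulls_equal=True, nans_equal=True):
    """Model of collect_set for one group; None plays the role of a null."""
    result = []
    seen = set()       # distinct ordinary values seen so far
    seen_null = False  # whether a null has already been kept
    seen_nan = False   # whether a NaN has already been kept
    for v in values:
        if v is None:
            if not include_nulls:
                continue  # like null_policy.EXCLUDE: drop nulls
            if nulls_equal and seen_null:
                continue  # like null_equality.EQUAL: keep one null
            seen_null = True
            result.append(v)
        elif isinstance(v, float) and math.isnan(v):
            if nans_equal and seen_nan:
                continue  # like nan_equality.ALL_EQUAL: keep one NaN
            seen_nan = True
            result.append(v)
        elif v not in seen:
            seen.add(v)
            result.append(v)
    return result
```

For example, `collect_set_sketch([1, None, 1, None, 2])` keeps a single null, while passing `nulls_equal=False` keeps both.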

pylibcudf.aggregation.correlation(correlation_type type, size_type min_periods) → Aggregation

Create a correlation aggregation.

For details, see make_correlation_aggregation().

Parameters:

type (correlation_type): The type of correlation to compute.
min_periods (int): The minimum number of observations to consider for computing the correlation.

Returns:

Aggregation: The correlation aggregation.

pylibcudf.aggregation.count(null_policy null_handling=null_policy.EXCLUDE) → Aggregation

Create a count aggregation.

For details, see make_count_aggregation().

Parameters:

null_handling (null_policy, default EXCLUDE): Whether or not nulls should be included.

Returns:

Aggregation: The count aggregation.
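The effect of null_handling on a count can be modelled in a few lines of plain Python. This is a sketch of the semantics only; `count_sketch` is a hypothetical helper, its boolean flag stands in for the null_policy enum, and `None` stands in for a null.

```python
def count_sketch(values, include_nulls=False):
    """Count the rows in one group; None plays the role of a null."""
    if include_nulls:
        # Like null_policy.INCLUDE: every row counts.
        return len(values)
    # Like null_policy.EXCLUDE (the default for count): skip nulls.
    return sum(1 for v in values if v is not None)
```

Note that count defaults to excluding nulls, whereas collect_list and collect_set default to including them.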

pylibcudf.aggregation.covariance(size_type min_periods, size_type ddof) → Aggregation

Create a covariance aggregation.

For details, see make_covariance_aggregation().

Parameters:

min_periods (int): The minimum number of observations to consider for computing the covariance.
ddof (int): Delta degrees of freedom.

Returns:

Aggregation: The covariance aggregation.
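To make the roles of min_periods and ddof concrete, here is a pure-Python sketch of a covariance over one group. It assumes the divisor is N - ddof, where N is the number of observations, that an observation counts only when both values are non-null, and that a non-positive divisor yields a null; `covariance_sketch` is a hypothetical helper and `None` stands in for a null.

```python
def covariance_sketch(xs, ys, min_periods, ddof):
    """Model of a covariance aggregation; None plays the role of a null."""
    # Assumption: only rows where both values are non-null are observations.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    # Too few observations (or a non-positive divisor) gives a null result.
    if n < min_periods or n - ddof <= 0:
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Divisor is N - ddof: ddof=1 is the sample covariance, ddof=0 the population one.
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - ddof)
```

With `xs = [1, 2, 3]` and `ys = [2, 4, 6]`, the sketch returns 2.0 for `ddof=1` and a null when `min_periods=4`.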

pylibcudf.aggregation.ewma(float center_of_mass, ewm_history history) → Aggregation

Create an EWMA aggregation.

For details, see make_ewma_aggregation().

Parameters:

center_of_mass (float): The decay in terms of the center of mass.
history (ewm_history): Whether or not to treat the history as infinite.

Returns:

Aggregation: The EWMA aggregation.
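For readers unfamiliar with the center-of-mass parametrization, the sketch below shows the conventional conversion to a smoothing factor, alpha = 1 / (1 + center_of_mass), together with the two textbook ways of treating the start of the sequence. These formulas follow the common pandas-style definitions and are an assumption here; this page does not state which ewm_history member selects which formula, so the sketch does not assert a mapping. All function names are hypothetical.

```python
def com_to_alpha(center_of_mass):
    """Convert a center of mass to the smoothing factor alpha."""
    return 1.0 / (1.0 + center_of_mass)


def ewma_recursive(values, center_of_mass):
    """First value taken as-is, then y_t = (1 - alpha) * y_{t-1} + alpha * x_t."""
    alpha = com_to_alpha(center_of_mass)
    out = []
    for x in values:
        out.append(x if not out else (1 - alpha) * out[-1] + alpha * x)
    return out


def ewma_adjusted(values, center_of_mass):
    """Weighted mean of the values seen so far, with weights (1 - alpha)**i."""
    alpha = com_to_alpha(center_of_mass)
    out = []
    num = den = 0.0
    for x in values:
        num = x + (1 - alpha) * num    # running weighted sum of values
        den = 1.0 + (1 - alpha) * den  # running sum of weights
        out.append(num / den)
    return out
```

The two variants agree on the first element and converge as the sequence grows; they differ most in the first few outputs.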

pylibcudf.aggregation.max() → Aggregation

Create a max aggregation.

For details, see make_max_aggregation().

Returns:

Aggregation: The max aggregation.

pylibcudf.aggregation.mean() → Aggregation

Create a mean aggregation.

For details, see make_mean_aggregation().

Returns:

Aggregation: The mean aggregation.

pylibcudf.aggregation.median() → Aggregation

Create a median aggregation.

For details, see make_median_aggregation().

Returns:

Aggregation: The median aggregation.

pylibcudf.aggregation.min() → Aggregation

Create a min aggregation.

For details, see make_min_aggregation().

Returns:

Aggregation: The min aggregation.

pylibcudf.aggregation.nth_element(size_type n, null_policy null_handling=null_policy.INCLUDE) → Aggregation

Create an nth_element aggregation.

For details, see make_nth_element_aggregation().

Parameters:

n (int): The index of the element to return from each group.
null_handling (null_policy, default INCLUDE): Whether or not nulls should be included.

Returns:

Aggregation: The nth_element aggregation.
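A pure-Python model shows how n and null_handling interact. It follows the libcudf description of this aggregation as understood here: negative indices count from the end of the group, and an index outside the group yields a null. Treat those details as assumptions to check against make_nth_element_aggregation(); `nth_element_sketch` is a hypothetical helper and `None` stands in for a null.

```python
def nth_element_sketch(group, n, include_nulls=True):
    """Pick the n-th element of one group; None plays the role of a null."""
    # Like null_policy.EXCLUDE: nulls are dropped before indexing.
    candidates = list(group) if include_nulls else [v for v in group if v is not None]
    size = len(candidates)
    # Assumption: indices outside [-size, size) produce a null result.
    if not -size <= n < size:
        return None
    return candidates[n]
```

For the group `[None, 5, 7]`, index 0 selects the null by default but selects 5 once nulls are excluded.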

pylibcudf.aggregation.nunique(null_policy null_handling=null_policy.EXCLUDE) → Aggregation

Create a nunique aggregation.

For details, see make_nunique_aggregation().

Parameters:

null_handling (null_policy, default EXCLUDE): Whether or not nulls should be included.

Returns:

Aggregation: The nunique aggregation.

pylibcudf.aggregation.product() → Aggregation

Create a product aggregation.

For details, see make_product_aggregation().

Returns:

Aggregation: The product aggregation.

pylibcudf.aggregation.quantile(list quantiles, interpolation interp=interpolation.LINEAR) → Aggregation

Create a quantile aggregation.

For details, see make_quantile_aggregation().

Parameters:

quantiles (list): The quantiles to compute; each should be between 0 and 1.
interp (interpolation, default LINEAR): Interpolation technique to use when the desired quantile lies between two data points.

Returns:

Aggregation: The quantile aggregation.
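The default LINEAR interpolation can be sketched in plain Python. The position convention used below, q * (N - 1) over the sorted values, is the one NumPy and pandas use for linear interpolation and is assumed here to match; `quantile_linear_sketch` is a hypothetical helper that handles a single quantile and non-null input only.

```python
import math


def quantile_linear_sketch(values, q):
    """Compute one quantile with linear interpolation between neighbours."""
    data = sorted(values)
    if not data:
        return None
    # Fractional position of the quantile within the sorted data.
    pos = q * (len(data) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    # Interpolate between the two data points surrounding the position.
    return data[lo] + (data[hi] - data[lo]) * frac
```

For `[1, 2, 3, 4]` the median falls halfway between 2 and 3, so the sketch returns 2.5.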

pylibcudf.aggregation.rank(rank_method method, order column_order=order.ASCENDING, null_policy null_handling=null_policy.EXCLUDE, null_order null_precedence=null_order.AFTER, rank_percentage percentage=rank_percentage.NONE) → Aggregation

Create a rank aggregation.

For details, see make_rank_aggregation().

Parameters:

method (rank_method): The method to use for ranking.
column_order (order, default ASCENDING): The order in which to sort the column.
null_handling (null_policy, default EXCLUDE): Whether or not nulls should be included.
null_precedence (null_order, default AFTER): Whether nulls should come before or after non-nulls.
percentage (rank_percentage, default NONE): Whether or not ranks should be converted to percentages, and if so, the type of normalization to use.

Returns:

Aggregation: The rank aggregation.

pylibcudf.aggregation.std(size_type ddof=1) → Aggregation

Create a std aggregation.

For details, see make_std_aggregation().

Parameters:

ddof (int, default 1): Delta degrees of freedom.

Returns:

Aggregation: The std aggregation.
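The meaning of ddof is easiest to see in the formula: the sum of squared deviations is divided by N - ddof, so ddof=1 gives the sample statistic and ddof=0 the population statistic. The pure-Python sketch below models this for one group, and the same divisor applies to the variance aggregation. The helper names are hypothetical, `None` stands in for a null, and skipping nulls and returning a null for a non-positive divisor are assumptions.

```python
import math


def variance_sketch(values, ddof=1):
    """Variance of one group with divisor N - ddof; None plays the role of a null."""
    data = [v for v in values if v is not None]  # assumption: nulls are skipped
    n = len(data)
    if n - ddof <= 0:
        return None  # assumption: not enough observations gives a null
    mean = sum(data) / n
    return sum((v - mean) ** 2 for v in data) / (n - ddof)


def std_sketch(values, ddof=1):
    """Standard deviation is the square root of the variance."""
    var = variance_sketch(values, ddof)
    return None if var is None else math.sqrt(var)
```

For `[2, 4, 4, 4, 5, 5, 7, 9]`, `std_sketch(..., ddof=0)` is exactly 2.0, while the default `ddof=1` gives a slightly larger value.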

pylibcudf.aggregation.sum() → Aggregation

Create a sum aggregation.

For details, see make_sum_aggregation().

Returns:

Aggregation: The sum aggregation.

pylibcudf.aggregation.sum_of_squares() → Aggregation

Create a sum_of_squares aggregation.

For details, see make_sum_of_squares_aggregation().

Returns:

Aggregation: The sum_of_squares aggregation.

pylibcudf.aggregation.udf(unicode operation, DataType output_type) → Aggregation

Create a udf aggregation.

For details, see make_udf_aggregation().

Parameters:

operation (str): The operation to perform as a string of PTX code.
output_type (DataType): The output type of the aggregation.

Returns:

Aggregation: The udf aggregation.

pylibcudf.aggregation.variance(size_type ddof=1) → Aggregation

Create a variance aggregation.

For details, see make_variance_aggregation().

Parameters:

ddof (int, default 1): Delta degrees of freedom.

Returns:

Aggregation: The variance aggregation.