
Files

file  approx_distinct_count.hpp
 
file  distinct_count.hpp
 
file  unique_count.hpp
 

Classes

class  cudf::approx_distinct_count
 Object-oriented HyperLogLog sketch for approximate distinct counting.
 

Functions

cudf::size_type cudf::distinct_count (column_view const &input, null_policy null_handling, nan_policy nan_handling, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Count the distinct elements in the column_view.
 
cudf::size_type cudf::distinct_count (table_view const &input, null_equality nulls_equal=null_equality::EQUAL, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Count the distinct rows in a table.
 
cudf::size_type cudf::unique_count (column_view const &input, null_policy null_handling, nan_policy nan_handling, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Count the number of consecutive groups of equivalent rows in a column.
 
cudf::size_type cudf::unique_count (table_view const &input, null_equality nulls_equal=null_equality::EQUAL, rmm::cuda_stream_view stream=cudf::get_default_stream())
 Count the number of consecutive groups of equivalent rows in a table.
 

Detailed Description

Function Documentation

◆ distinct_count() [1/2]

cudf::size_type cudf::distinct_count ( column_view const &  input,
null_policy  null_handling,
nan_policy  nan_handling,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Count the distinct elements in the column_view.

Given an input column_view, returns the number of distinct elements in that column_view.

If null_handling is null_policy::EXCLUDE and nan_handling is nan_policy::NAN_IS_NULL, both NaN and null values are ignored. If null_handling is null_policy::EXCLUDE and nan_handling is nan_policy::NAN_IS_VALID, only null values are ignored and NaN values are included in the distinct count.

nulls are handled as equal.

Parameters
[in] input          The column_view whose distinct elements will be counted
[in] null_handling  Flag to include or ignore nulls while counting
[in] nan_handling   Flag indicating whether NaN is treated as null
[in] stream         CUDA stream used for device memory operations and kernel launches
Returns
Number of distinct elements in the column
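
The interaction of null_handling and nan_handling described above can be modeled on the host. The following is a minimal sketch, not libcudf code: the enums are hypothetical stand-ins named after the cudf ones, std::nullopt plays the role of a null, and distinct_count_model is an illustrative name. Two behaviors are assumptions rather than statements from this page: all NaN values are counted as a single distinct value, and under null_policy::INCLUDE all nulls (plus NaN values when NaN is treated as null) contribute one to the count.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <set>
#include <vector>

// Hypothetical host-side stand-ins named after the cudf enums.
// These are not the library's own definitions.
enum class null_policy { EXCLUDE, INCLUDE };
enum class nan_policy { NAN_IS_NULL, NAN_IS_VALID };

// std::nullopt models a null element.
using element = std::optional<double>;

// Reference model of the counting rules (illustration only).
std::size_t distinct_count_model(std::vector<element> const& input,
                                 null_policy null_handling,
                                 nan_policy nan_handling)
{
  std::set<double> values;  // distinct values that are neither null nor NaN
  bool saw_null = false;
  bool saw_nan  = false;

  for (element const& e : input) {
    bool const is_nan = e.has_value() && std::isnan(*e);
    if (!e.has_value() || (is_nan && nan_handling == nan_policy::NAN_IS_NULL)) {
      saw_null = true;  // a null, or a NaN that is treated as null
    } else if (is_nan) {
      saw_nan = true;   // assumption: all NaN values form one distinct value
    } else {
      values.insert(*e);
    }
  }

  std::size_t count = values.size();
  if (saw_nan) { ++count; }
  // Assumption: nulls are equal to each other, so they add at most one.
  if (saw_null && null_handling == null_policy::INCLUDE) { ++count; }
  return count;
}
```

For the input {1.0, null, 1.0, NaN, 2.0, null, NaN}, this model yields 2 with EXCLUDE and NAN_IS_NULL, and 3 with EXCLUDE and NAN_IS_VALID, matching the two cases spelled out in the description.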

◆ distinct_count() [2/2]

cudf::size_type cudf::distinct_count ( table_view const &  input,
null_equality  nulls_equal = null_equality::EQUAL,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Count the distinct rows in a table.

Parameters
[in] input        Table whose distinct rows will be counted
[in] nulls_equal  Flag denoting whether null elements are considered equal; nulls are not equal if null_equality::UNEQUAL
[in] stream       CUDA stream used for device memory operations and kernel launches
Returns
Number of distinct rows in the table
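
The effect of nulls_equal on row comparison can be illustrated with a small host-side model. This is a sketch under stated assumptions, not libcudf code: null_equality here is a hypothetical stand-in for the cudf enum, rows are plain vectors of std::optional<int> with std::nullopt as a null, and rows_equal and distinct_row_count_model are illustrative names. The quadratic scan is chosen for clarity only.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in named after the cudf enum.
enum class null_equality { EQUAL, UNEQUAL };

using cell = std::optional<int>;  // std::nullopt models a null element
using row  = std::vector<cell>;

// Two rows are equal when every pair of cells is equal. Two nulls in the
// same position compare equal only when nulls_equal is EQUAL.
bool rows_equal(row const& a, row const& b, null_equality nulls_equal)
{
  assert(a.size() == b.size());  // all rows of a table have the same width
  for (std::size_t i = 0; i < a.size(); ++i) {
    bool const both_null = !a[i].has_value() && !b[i].has_value();
    if (both_null) {
      if (nulls_equal == null_equality::UNEQUAL) { return false; }
    } else if (a[i] != b[i]) {
      return false;
    }
  }
  return true;
}

// Counts each row that is not equal to any earlier row.
std::size_t distinct_row_count_model(std::vector<row> const& rows,
                                     null_equality nulls_equal = null_equality::EQUAL)
{
  std::size_t count = 0;
  for (std::size_t i = 0; i < rows.size(); ++i) {
    bool seen_before = false;
    for (std::size_t j = 0; j < i && !seen_before; ++j) {
      seen_before = rows_equal(rows[j], rows[i], nulls_equal);
    }
    if (!seen_before) { ++count; }
  }
  return count;
}
```

With the rows (1, 2), (1, 2), (null, 3), (null, 3), (4, 5), the model counts 3 distinct rows under EQUAL and 4 under UNEQUAL, because the two rows containing a null no longer match each other.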

◆ unique_count() [1/2]

cudf::size_type cudf::unique_count ( column_view const &  input,
null_policy  null_handling,
nan_policy  nan_handling,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Count the number of consecutive groups of equivalent rows in a column.

If null_handling is null_policy::EXCLUDE and nan_handling is nan_policy::NAN_IS_NULL, both NaN and null values are ignored. If null_handling is null_policy::EXCLUDE and nan_handling is nan_policy::NAN_IS_VALID, only null values are ignored and NaN values are included in the count.

nulls are handled as equal.

Parameters
[in] input          The column_view whose consecutive groups of equivalent rows will be counted
[in] null_handling  Flag to include or ignore nulls while counting
[in] nan_handling   Flag indicating whether NaN is treated as null
[in] stream         CUDA stream used for device memory operations and kernel launches
Returns
Number of consecutive groups of equivalent rows in the column
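
Counting consecutive groups differs from counting distinct values whenever equal values are not adjacent. The host-side sketch below contrasts the two on plain integers. It is an illustration only, not libcudf code: consecutive_group_count and distinct_value_count are hypothetical names, and null and NaN handling is deliberately left out.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of runs of equal adjacent values: a new group starts at the first
// element and at every element that differs from its predecessor.
std::size_t consecutive_group_count(std::vector<int> const& values)
{
  if (values.empty()) { return 0; }
  std::size_t groups = 1;
  for (std::size_t i = 1; i < values.size(); ++i) {
    if (values[i] != values[i - 1]) { ++groups; }
  }
  return groups;
}

// Number of distinct values regardless of position, for comparison.
std::size_t distinct_value_count(std::vector<int> const& values)
{
  return std::set<int>(values.begin(), values.end()).size();
}
```

For {1, 1, 2, 2, 1, 3, 3} the group count is 4 (runs of 1, 2, 1, 3) while the distinct count is 3. On sorted input such as {1, 1, 1, 2, 2, 3, 3} the two counts agree.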

◆ unique_count() [2/2]

cudf::size_type cudf::unique_count ( table_view const &  input,
null_equality  nulls_equal = null_equality::EQUAL,
rmm::cuda_stream_view  stream = cudf::get_default_stream() 
)

Count the number of consecutive groups of equivalent rows in a table.

Parameters
[in] input        Table whose consecutive groups of equivalent rows will be counted
[in] nulls_equal  Flag denoting whether null elements are considered equal; nulls are not equal if null_equality::UNEQUAL
[in] stream       CUDA stream used for device memory operations and kernel launches
Returns
Number of consecutive groups of equivalent rows in the table