cudf::hash_join Class Reference

Hash join that builds a hash table on construction and probes it in subsequent *_join member functions.

#include <join.hpp>

Public Member Functions

 hash_join (hash_join const &)=delete
 
 hash_join (hash_join &&)=delete
 
hash_join & operator= (hash_join const &)=delete
 
hash_join & operator= (hash_join &&)=delete
 
 hash_join (cudf::table_view const &build, null_equality compare_nulls, rmm::cuda_stream_view stream=rmm::cuda_stream_default)
 Construct a hash join object for subsequent probe calls.
 
std::pair< std::unique_ptr< rmm::device_uvector< size_type > >, std::unique_ptr< rmm::device_uvector< size_type > > > inner_join (cudf::table_view const &probe, std::optional< std::size_t > output_size={}, rmm::cuda_stream_view stream=rmm::cuda_stream_default, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource()) const
 
std::pair< std::unique_ptr< rmm::device_uvector< size_type > >, std::unique_ptr< rmm::device_uvector< size_type > > > left_join (cudf::table_view const &probe, std::optional< std::size_t > output_size={}, rmm::cuda_stream_view stream=rmm::cuda_stream_default, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource()) const
 
std::pair< std::unique_ptr< rmm::device_uvector< size_type > >, std::unique_ptr< rmm::device_uvector< size_type > > > full_join (cudf::table_view const &probe, std::optional< std::size_t > output_size={}, rmm::cuda_stream_view stream=rmm::cuda_stream_default, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource()) const
 
std::size_t inner_join_size (cudf::table_view const &probe, rmm::cuda_stream_view stream=rmm::cuda_stream_default) const
 
std::size_t left_join_size (cudf::table_view const &probe, rmm::cuda_stream_view stream=rmm::cuda_stream_default) const
 
std::size_t full_join_size (cudf::table_view const &probe, rmm::cuda_stream_view stream=rmm::cuda_stream_default, rmm::mr::device_memory_resource *mr=rmm::mr::get_current_device_resource()) const
 

Detailed Description

Hash join that builds a hash table on construction and probes it in subsequent *_join member functions.

This class enables a hash join scheme that builds the hash table once and probes it as many times as needed (possibly in parallel).
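The build-once, probe-many scheme can be illustrated with a small host-side model. This is only a sketch using the C++ standard library, not the libcudf implementation: `HostHashJoin` and `Index` are hypothetical names, keys are plain `int` values, and everything runs on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <utility>
#include <vector>

using Index = std::int32_t;  // hypothetical stand-in for a row index type

// Hypothetical host-side model of the scheme; not a libcudf type.
class HostHashJoin {
 public:
  // Build phase: runs exactly once, at construction.
  explicit HostHashJoin(std::vector<int> const& build_keys) {
    assert(build_keys.size() <=
           static_cast<std::size_t>(std::numeric_limits<Index>::max()));
    for (std::size_t row = 0; row < build_keys.size(); ++row) {
      table_.emplace(build_keys[row], static_cast<Index>(row));
    }
  }

  // Probe phase: const, so one object can serve any number of probe calls.
  // Returns one (probe_row, build_row) pair per matching key.
  std::vector<std::pair<Index, Index>> inner_join(
      std::vector<int> const& probe_keys) const {
    std::vector<std::pair<Index, Index>> matches;
    for (std::size_t row = 0; row < probe_keys.size(); ++row) {
      auto const range = table_.equal_range(probe_keys[row]);
      for (auto it = range.first; it != range.second; ++it) {
        matches.emplace_back(static_cast<Index>(row), it->second);
      }
    }
    return matches;
  }

 private:
  std::unordered_multimap<int, Index> table_;  // key -> build row
};
```

Because the probe member function is `const` and only reads the table, the same object can be reused for several probe tables without rebuilding, which is the property the class description refers to.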

Definition at line 504 of file join.hpp.

Constructor & Destructor Documentation

◆ hash_join()

cudf::hash_join::hash_join ( cudf::table_view const &  build,
null_equality  compare_nulls,
rmm::cuda_stream_view  stream = rmm::cuda_stream_default 
)

Construct a hash join object for subsequent probe calls.

Note
The hash_join object must not outlive the table viewed by build, else behavior is undefined.
Parameters
build          The build table, from which the hash table is built.
compare_nulls  Controls whether null join-key values should match or not.
stream         CUDA stream used for device memory operations and kernel launches.
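The effect of compare_nulls can be sketched on the host as follows. This is an illustration under stated assumptions, not libcudf code: `NullEquality` is a hypothetical local stand-in for `null_equality`, `std::nullopt` models a null key, and a nested loop replaces the hash table for brevity.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for the null_equality parameter.
enum class NullEquality { EQUAL, UNEQUAL };

using Key = std::optional<int>;  // std::nullopt models a null join key

// Two valid keys match when their values are equal. Two null keys match
// only when nulls are configured to compare equal. A null never matches
// a valid key.
bool keys_match(Key const& a, Key const& b, NullEquality compare_nulls) {
  if (a.has_value() && b.has_value()) { return *a == *b; }
  if (!a.has_value() && !b.has_value()) {
    return compare_nulls == NullEquality::EQUAL;
  }
  return false;
}

// Counts inner-join matches with a nested loop (no hashing, for brevity).
std::size_t count_inner_matches(std::vector<Key> const& build,
                                std::vector<Key> const& probe,
                                NullEquality compare_nulls) {
  std::size_t matches = 0;
  for (auto const& p : probe) {
    for (auto const& b : build) {
      if (keys_match(p, b, compare_nulls)) { ++matches; }
    }
  }
  return matches;
}
```

With build keys {1, null, 2} and probe keys {null, 2, 3}, this model counts two matches when nulls compare equal and one match when they do not.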

Member Function Documentation

◆ full_join()

std::pair<std::unique_ptr<rmm::device_uvector<size_type> >, std::unique_ptr<rmm::device_uvector<size_type> > > cudf::hash_join::full_join ( cudf::table_view const &  probe,
std::optional< std::size_t >  output_size = {},
rmm::cuda_stream_view  stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
) const

Returns the row indices that can be used to construct the result of performing a full join between two tables.

See also
cudf::full_join(). Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters
probe        The probe table, from which the tuples are probed.
output_size  Optional value which allows users to specify the exact output size.
stream       CUDA stream used for device memory operations and kernel launches.
mr           Device memory resource used to allocate the returned table and columns' device memory.
Returns
A pair of columns [left_indices, right_indices] that can be used to construct the result of performing a full join between two tables with build and probe as the join keys.
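The shape of a full-join index pair can be sketched on the host. The following is a model only, under several assumptions that the text above does not state: the probe table is treated as the left side, `kNoMatch` is a hypothetical sentinel for "no row on this side" (the value libcudf actually uses may differ), and keys are plain `int` values.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using Index = std::int32_t;
constexpr Index kNoMatch = -1;  // hypothetical sentinel, an assumption

// Returns {left_indices, right_indices}. In this model, left is the probe
// side and right is the build side (an assumption).
std::pair<std::vector<Index>, std::vector<Index>> host_full_join(
    std::vector<int> const& build_keys, std::vector<int> const& probe_keys) {
  std::unordered_multimap<int, Index> table;
  for (std::size_t row = 0; row < build_keys.size(); ++row) {
    table.emplace(build_keys[row], static_cast<Index>(row));
  }

  std::vector<Index> left, right;
  std::vector<bool> build_matched(build_keys.size(), false);

  // Every probe row appears at least once, paired with kNoMatch if unmatched.
  for (std::size_t row = 0; row < probe_keys.size(); ++row) {
    auto const range = table.equal_range(probe_keys[row]);
    if (range.first == range.second) {
      left.push_back(static_cast<Index>(row));
      right.push_back(kNoMatch);
      continue;
    }
    for (auto it = range.first; it != range.second; ++it) {
      left.push_back(static_cast<Index>(row));
      right.push_back(it->second);
      build_matched[static_cast<std::size_t>(it->second)] = true;
    }
  }

  // Build rows that no probe row matched are appended with kNoMatch on the left.
  for (std::size_t row = 0; row < build_keys.size(); ++row) {
    if (!build_matched[row]) {
      left.push_back(kNoMatch);
      right.push_back(static_cast<Index>(row));
    }
  }
  return {std::move(left), std::move(right)};
}
```

The second pass over the build side is what distinguishes a full join from a left join in this model: it needs to know which build rows were never matched.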

◆ full_join_size()

std::size_t cudf::hash_join::full_join_size ( cudf::table_view const &  probe,
rmm::cuda_stream_view  stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
) const

Returns the exact number of matches (rows) when performing a full join with the specified probe table.

Parameters
probe   The probe table, from which the tuples are probed.
stream  CUDA stream used for device memory operations and kernel launches.
mr      Device memory resource used to allocate the intermediate table and columns' device memory.
Returns
The exact number of output rows when performing a full join between two tables with build and probe as the join keys.

◆ inner_join()

std::pair<std::unique_ptr<rmm::device_uvector<size_type> >, std::unique_ptr<rmm::device_uvector<size_type> > > cudf::hash_join::inner_join ( cudf::table_view const &  probe,
std::optional< std::size_t >  output_size = {},
rmm::cuda_stream_view  stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
) const

Returns the row indices that can be used to construct the result of performing an inner join between two tables.

See also
cudf::inner_join(). Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters
probe        The probe table, from which the tuples are probed.
output_size  Optional value which allows users to specify the exact output size.
stream       CUDA stream used for device memory operations and kernel launches.
mr           Device memory resource used to allocate the returned table and columns' device memory.
Returns
A pair of columns [left_indices, right_indices] that can be used to construct the result of performing an inner join between two tables with build and probe as the join keys.
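To show how such an index pair is typically consumed, here is a host-side sketch of a gather: position k of the join result takes row left_indices[k] from the left table and row right_indices[k] from the right table. `host_gather` and `Index` are hypothetical names for illustration, and a single string column stands in for a table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Index = std::int32_t;

// Picks column[i] for each i in indices, in order (a host-side gather).
// Inner-join indices are always valid rows, which the assert checks.
std::vector<std::string> host_gather(std::vector<std::string> const& column,
                                     std::vector<Index> const& indices) {
  std::vector<std::string> out;
  out.reserve(indices.size());
  for (Index i : indices) {
    assert(i >= 0 && static_cast<std::size_t>(i) < column.size());
    out.push_back(column[static_cast<std::size_t>(i)]);
  }
  return out;
}
```

Gathering each left column with left_indices and each right column with right_indices, then placing the results side by side, yields the joined rows in this model.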

◆ inner_join_size()

std::size_t cudf::hash_join::inner_join_size ( cudf::table_view const &  probe,
rmm::cuda_stream_view  stream = rmm::cuda_stream_default 
) const

Returns the exact number of matches (rows) when performing an inner join with the specified probe table.

Parameters
probe   The probe table, from which the tuples are probed.
stream  CUDA stream used for device memory operations and kernel launches.
Returns
The exact number of output rows when performing an inner join between two tables with build and probe as the join keys.
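One plausible use of the size query is to compute the output size once and pass it to the join as output_size so the join can skip its own counting pass. The host-side sketch below models that pattern under stated assumptions: `build_table`, `host_inner_join_size`, and `host_inner_join` are hypothetical names, and the `assert` stands in for the documented precondition that a supplied size must not be smaller than the actual output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

using Index = std::int32_t;
using HashTable = std::unordered_multimap<int, Index>;  // key -> build row

HashTable build_table(std::vector<int> const& build_keys) {
  HashTable table;
  for (std::size_t row = 0; row < build_keys.size(); ++row) {
    table.emplace(build_keys[row], static_cast<Index>(row));
  }
  return table;
}

// Exact number of inner-join output rows, without producing them.
std::size_t host_inner_join_size(HashTable const& table,
                                 std::vector<int> const& probe_keys) {
  std::size_t rows = 0;
  for (int key : probe_keys) { rows += table.count(key); }
  return rows;
}

// If output_size is given, it is trusted and the counting pass is skipped.
std::pair<std::vector<Index>, std::vector<Index>> host_inner_join(
    HashTable const& table, std::vector<int> const& probe_keys,
    std::optional<std::size_t> output_size = {}) {
  std::size_t const capacity =
      output_size ? *output_size : host_inner_join_size(table, probe_keys);
  std::vector<Index> left, right;
  left.reserve(capacity);
  right.reserve(capacity);
  for (std::size_t row = 0; row < probe_keys.size(); ++row) {
    auto const range = table.equal_range(probe_keys[row]);
    for (auto it = range.first; it != range.second; ++it) {
      left.push_back(static_cast<Index>(row));
      right.push_back(it->second);
    }
  }
  assert(left.size() <= capacity);  // models the output_size precondition
  return {std::move(left), std::move(right)};
}
```

In this model an undersized hint only trips the assertion. With preallocated device buffers the same mistake could write out of bounds, which is consistent with the join functions documenting it as undefined behavior.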

◆ left_join()

std::pair<std::unique_ptr<rmm::device_uvector<size_type> >, std::unique_ptr<rmm::device_uvector<size_type> > > cudf::hash_join::left_join ( cudf::table_view const &  probe,
std::optional< std::size_t >  output_size = {},
rmm::cuda_stream_view  stream = rmm::cuda_stream_default,
rmm::mr::device_memory_resource *  mr = rmm::mr::get_current_device_resource() 
) const

Returns the row indices that can be used to construct the result of performing a left join between two tables.

See also
cudf::left_join(). Behavior is undefined if the provided output_size is smaller than the actual output size.
Parameters
probe        The probe table, from which the tuples are probed.
output_size  Optional value which allows users to specify the exact output size.
stream       CUDA stream used for device memory operations and kernel launches.
mr           Device memory resource used to allocate the returned table and columns' device memory.
Returns
A pair of columns [left_indices, right_indices] that can be used to construct the result of performing a left join between two tables with build and probe as the join keys.
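A left join differs from an inner join in that unmatched left rows still appear in the output. The host-side sketch below models this and shows how the right-side indices might then be gathered into nullable values. It rests on assumptions not stated above: the probe table is the left side, `kNoMatch` is a hypothetical sentinel (libcudf's actual value may differ), and `host_left_join` and `gather_nullable` are illustrative names.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Index = std::int32_t;
constexpr Index kNoMatch = -1;  // hypothetical sentinel, an assumption

// Returns {left_indices, right_indices}; every probe (left) row appears at
// least once, paired with kNoMatch when no build (right) row matches.
std::pair<std::vector<Index>, std::vector<Index>> host_left_join(
    std::vector<int> const& build_keys, std::vector<int> const& probe_keys) {
  std::unordered_multimap<int, Index> table;
  for (std::size_t row = 0; row < build_keys.size(); ++row) {
    table.emplace(build_keys[row], static_cast<Index>(row));
  }
  std::vector<Index> left, right;
  for (std::size_t row = 0; row < probe_keys.size(); ++row) {
    auto const range = table.equal_range(probe_keys[row]);
    if (range.first == range.second) {
      left.push_back(static_cast<Index>(row));
      right.push_back(kNoMatch);
      continue;
    }
    for (auto it = range.first; it != range.second; ++it) {
      left.push_back(static_cast<Index>(row));
      right.push_back(it->second);
    }
  }
  return {std::move(left), std::move(right)};
}

// Gathers a right-side column; kNoMatch becomes a null (std::nullopt).
std::vector<std::optional<std::string>> gather_nullable(
    std::vector<std::string> const& column, std::vector<Index> const& indices) {
  std::vector<std::optional<std::string>> out;
  out.reserve(indices.size());
  for (Index i : indices) {
    if (i == kNoMatch) {
      out.push_back(std::nullopt);
    } else {
      out.push_back(column.at(static_cast<std::size_t>(i)));
    }
  }
  return out;
}
```

With build keys {10, 20} and probe keys {20, 30, 10}, this model keeps all three probe rows and reports a null on the right side for the probe row whose key is 30.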

◆ left_join_size()

std::size_t cudf::hash_join::left_join_size ( cudf::table_view const &  probe,
rmm::cuda_stream_view  stream = rmm::cuda_stream_default 
) const

Returns the exact number of matches (rows) when performing a left join with the specified probe table.

Parameters
probe   The probe table, from which the tuples are probed.
stream  CUDA stream used for device memory operations and kernel launches.
Returns
The exact number of output rows when performing a left join between two tables with build and probe as the join keys.

The documentation for this class was generated from the following file: