Filtered hash join that builds a hash table from the right (filter) table on construction and probes it in subsequent *_join member functions.
#include <filtered_join.hpp>
Public Member Functions
| filtered_join (filtered_join const &)=delete | |
| filtered_join (filtered_join &&)=delete | |
| filtered_join & | operator= (filtered_join const &)=delete |
| filtered_join & | operator= (filtered_join &&)=delete |
| filtered_join (cudf::table_view const &build, cudf::null_equality compare_nulls, rmm::cuda_stream_view stream) | |
| Constructs a filtered hash join object for subsequent probe calls. | |
| filtered_join (cudf::table_view const &build, cudf::null_equality compare_nulls, double load_factor, rmm::cuda_stream_view stream) | |
| Constructs a filtered hash join object for subsequent probe calls. | |
| filtered_join (cudf::table_view const &build, cudf::null_equality compare_nulls, set_as_build_table reuse_tbl, rmm::cuda_stream_view stream) | |
| Constructs a filtered hash join object for subsequent probe calls. | |
| filtered_join (cudf::table_view const &build, null_equality compare_nulls, set_as_build_table reuse_tbl, double load_factor, rmm::cuda_stream_view stream) | |
| Constructs a filtered hash join object for subsequent probe calls. | |
| std::unique_ptr< rmm::device_uvector< size_type > > | semi_join (cudf::table_view const &probe, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref()) const |
| Returns a vector of row indices corresponding to a semi-join between the specified tables. | |
| std::unique_ptr< rmm::device_uvector< size_type > > | anti_join (cudf::table_view const &probe, rmm::cuda_stream_view stream=cudf::get_default_stream(), rmm::device_async_resource_ref mr=cudf::get_current_device_resource_ref()) const |
| Returns a vector of row indices corresponding to an anti-join between the specified tables. | |
This class enables the filtered hash join scheme that builds a hash table once from the right table, and probes as many times as needed (possibly in parallel) with different left tables. The right table acts as the filter to be applied on left tables in subsequent *_join operations. The underlying data structure is cuco::static_set.
For use cases where the left table should be reused with multiple right tables, use cudf::mark_join instead.
Definition at line 59 of file filtered_join.hpp.
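The build-once, probe-many pattern described above can be illustrated with a small host-side sketch. This is not the libcudf implementation: `host_filtered_join` is a hypothetical stand-in that uses `std::unordered_set` on single integer keys in place of `cuco::static_set` on device tables, and it only mirrors the semantics of `semi_join` and `anti_join`.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical host-side analogue of the filtered join; names are illustrative only.
class host_filtered_join {
 public:
  // Build phase: hash the right (filter) keys exactly once.
  explicit host_filtered_join(std::vector<int> const& build)
    : filter_(build.begin(), build.end()) {}

  // Indices of probe (left) rows whose key is present in the filter.
  std::vector<std::size_t> semi_join(std::vector<int> const& probe) const {
    return probe_rows(probe, true);
  }

  // Indices of probe (left) rows whose key is absent from the filter.
  std::vector<std::size_t> anti_join(std::vector<int> const& probe) const {
    return probe_rows(probe, false);
  }

 private:
  // Probe phase: look up each left key and keep the row index when the
  // lookup result equals `keep_matches`.
  std::vector<std::size_t> probe_rows(std::vector<int> const& probe, bool keep_matches) const {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < probe.size(); ++i) {
      bool const found = filter_.count(probe[i]) != 0;
      if (found == keep_matches) { out.push_back(i); }
    }
    return out;
  }

  std::unordered_set<int> filter_;  // built once, read-only afterwards
};
```

Because the probe functions are `const` and never modify the set, one object can serve any number of left tables, which is the property the class description relies on.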
cudf::filtered_join::filtered_join (
    cudf::table_view const & build,
    cudf::null_equality compare_nulls,
    rmm::cuda_stream_view stream
)
Constructs a filtered hash join object for subsequent probe calls.
The build table is always treated as the right (filter) table. It will be applied to multiple left (probe) tables in subsequent semi_join or anti_join calls.
| build | The right (filter) table used to build the hash table |
| compare_nulls | Controls whether null join-key values should match or not |
| stream | CUDA stream used for device memory operations and kernel launches |
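The effect of compare_nulls can be sketched on the host with nullable keys. The code below is an assumption-laden illustration, not libcudf code: `null_eq` is a hypothetical stand-in for `cudf::null_equality`, `std::nullopt` models a null key, and matching uses a linear search for clarity rather than a hash table.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for cudf::null_equality.
enum class null_eq { EQUAL, UNEQUAL };

// std::nullopt represents a null join key.
using nullable_key = std::optional<int>;

// True when a probe key matches a build key under the chosen null policy.
bool keys_match(nullable_key const& probe, nullable_key const& build, null_eq policy) {
  if (!probe.has_value() || !build.has_value()) {
    // A null matches only another null, and only when nulls compare equal.
    return policy == null_eq::EQUAL && !probe.has_value() && !build.has_value();
  }
  return *probe == *build;
}

// Semi-join row indices of `probe` against `build` (linear search, for clarity).
std::vector<std::size_t> semi_join_indices(std::vector<nullable_key> const& probe,
                                           std::vector<nullable_key> const& build,
                                           null_eq policy) {
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < probe.size(); ++i) {
    bool const matched =
      std::any_of(build.begin(), build.end(), [&](nullable_key const& b) {
        return keys_match(probe[i], b, policy);
      });
    if (matched) { out.push_back(i); }
  }
  return out;
}
```

Under this sketch, a null probe key is kept by a semi-join only when the policy is EQUAL and the build side also contains a null; under UNEQUAL it never matches.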
cudf::filtered_join::filtered_join (
    cudf::table_view const & build,
    cudf::null_equality compare_nulls,
    double load_factor,
    rmm::cuda_stream_view stream
)
Constructs a filtered hash join object for subsequent probe calls.
The build table is always treated as the right (filter) table. It will be applied to multiple left (probe) tables in subsequent semi_join or anti_join calls.
| build | The right (filter) table used to build the hash table |
| compare_nulls | Controls whether null join-key values should match or not |
| load_factor | The desired ratio of filled slots to total slots in the hash table, must be in range (0,1]. For example, 0.5 indicates a target of 50% occupancy. Note that the actual occupancy achieved may be slightly lower than the specified value. |
| stream | CUDA stream used for device memory operations and kernel launches |
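The relationship between load_factor and table size can be sketched with plain arithmetic. The helpers below are hypothetical and do not reproduce how libcudf or cuco size the table; they only show that rounding the slot count up to a whole number leaves the achieved occupancy at or below the requested value.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: smallest slot count that keeps occupancy <= load_factor.
// The real library may round the capacity up further (an assumption here),
// which would lower the achieved occupancy slightly more.
std::size_t slots_for(std::size_t num_rows, double load_factor) {
  if (!(load_factor > 0.0 && load_factor <= 1.0)) {
    throw std::invalid_argument("load_factor must be in (0, 1]");
  }
  return static_cast<std::size_t>(
    std::ceil(static_cast<double>(num_rows) / load_factor));
}

// Achieved occupancy: filled slots divided by total slots.
double occupancy(std::size_t num_rows, std::size_t slots) {
  return slots == 0 ? 0.0 : static_cast<double>(num_rows) / static_cast<double>(slots);
}
```

For example, 10 rows at a load factor of 0.3 need 34 slots in this sketch, giving an occupancy of about 0.294, just under the requested 0.3.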
cudf::filtered_join::filtered_join (
    cudf::table_view const & build,
    cudf::null_equality compare_nulls,
    set_as_build_table reuse_tbl,
    rmm::cuda_stream_view stream
)
Constructs a filtered hash join object for subsequent probe calls.
| build | The right (filter) table used to build the hash table |
| compare_nulls | Controls whether null join-key values should match or not |
| reuse_tbl | Specifies which table to use as the build table. Only RIGHT is supported. |
| stream | CUDA stream used for device memory operations and kernel launches |
cudf::filtered_join::filtered_join (
    cudf::table_view const & build,
    null_equality compare_nulls,
    set_as_build_table reuse_tbl,
    double load_factor,
    rmm::cuda_stream_view stream
)
Constructs a filtered hash join object for subsequent probe calls.
| build | The right (filter) table used to build the hash table |
| compare_nulls | Controls whether null join-key values should match or not |
| reuse_tbl | Specifies which table to use as the build table. Only RIGHT is supported. |
| load_factor | The desired ratio of filled slots to total slots in the hash table, must be in range (0,1]. For example, 0.5 indicates a target of 50% occupancy. Note that the actual occupancy achieved may be slightly lower than the specified value. |
| stream | CUDA stream used for device memory operations and kernel launches |
std::unique_ptr<rmm::device_uvector<size_type>> cudf::filtered_join::anti_join (
    cudf::table_view const & probe,
    rmm::cuda_stream_view stream = cudf::get_default_stream(),
    rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()
) const
Returns a vector of row indices corresponding to an anti-join between the specified tables.
The returned vector contains the row indices from the probe (left) table for which there are no matching rows in the build (right/filter) table.
| probe | The probe (left) table |
| stream | CUDA stream used for device memory operations and kernel launches |
| mr | Device memory resource used to allocate the returned vector's device memory |
Returns: A vector left_indices that can be used to construct the result of performing a left anti join
std::unique_ptr<rmm::device_uvector<size_type>> cudf::filtered_join::semi_join (
    cudf::table_view const & probe,
    rmm::cuda_stream_view stream = cudf::get_default_stream(),
    rmm::device_async_resource_ref mr = cudf::get_current_device_resource_ref()
) const
Returns a vector of row indices corresponding to a semi-join between the specified tables.
The returned vector contains the row indices from the probe (left) table for which there is a matching row in the build (right/filter) table.
| probe | The probe (left) table |
| stream | CUDA stream used for device memory operations and kernel launches |
| mr | Device memory resource used to allocate the returned vector's device memory |
Returns: A vector left_indices that can be used to construct the result of performing a left semi join
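Both functions return row indices rather than rows, so a separate gather step is needed to materialize the joined result. The sketch below shows that step on the host with a hypothetical `gather_rows` helper over a single `std::vector` column; in libcudf the analogous step would typically be a gather over the left table, which this sketch does not call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical host-side gather: copy the rows of `column` selected by `indices`,
// in the order the indices appear.
template <typename T>
std::vector<T> gather_rows(std::vector<T> const& column,
                           std::vector<std::size_t> const& indices) {
  std::vector<T> out;
  out.reserve(indices.size());
  for (std::size_t idx : indices) {
    out.push_back(column.at(idx));  // at() throws std::out_of_range on a bad index
  }
  return out;
}
```

Applying the semi-join indices and the anti-join indices of the same probe table to the same column yields two disjoint sets of rows that together cover every left row exactly once.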