join

class pylibcudf.join.FilteredJoin

Filtered hash join that builds a hash table from the right (filter) table on creation and probes that table in subsequent calls to the join member functions.

The right table is used as the filter applied to multiple left tables in subsequent semi_join or anti_join calls. For use cases where the left table should be reused with multiple right tables, use MarkJoin instead.

For details, see cudf::filtered_join.
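The build-once, probe-many pattern described above can be sketched in plain Python. This is only a reference model of the semantics, not the pylibcudf implementation: `FilteredJoinModel` is a hypothetical name, rows are represented as tuples, and null handling is ignored.

```python
# Hypothetical pure-Python model; it does not use or mirror pylibcudf internals.
class FilteredJoinModel:
    """Build once from the right (filter) rows, then probe with many left tables."""

    def __init__(self, right_rows):
        # Build step: hash the right (filter) rows a single time.
        self._filter = set(right_rows)

    def semi_join(self, left_rows):
        # Indices of left rows that have a match in the filter.
        return [i for i, row in enumerate(left_rows) if row in self._filter]

    def anti_join(self, left_rows):
        # Indices of left rows that have no match in the filter.
        return [i for i, row in enumerate(left_rows) if row not in self._filter]


model = FilteredJoinModel([(1,), (3,)])

# The same filter is reused for two different left tables.
batch_a = [(1,), (2,), (3,)]
batch_b = [(4,), (3,)]

semi_a = model.semi_join(batch_a)
anti_a = model.anti_join(batch_a)
semi_b = model.semi_join(batch_b)
```

The model returns indices in left-table order; that ordering is an artifact of the sketch and should not be assumed for the real class.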

Methods

anti_join(self, Table left[, stream])

Returns a column of row indices corresponding to an anti-join between the right (filter) table and left table.

semi_join(self, Table left[, stream])

Returns a column of row indices corresponding to a semi-join between the right (filter) table and left table.

anti_join(self, Table left, stream=None, DeviceMemoryResource mr=None)

Returns a column of row indices corresponding to an anti-join between the right (filter) table and left table.

For details, see cudf::filtered_join::anti_join().

Parameters:
left : Table

The left table.

stream : Stream, optional

CUDA stream used for device memory operations and kernel launches.

mr : DeviceMemoryResource, optional

Device memory resource used to allocate the returned column’s device memory.

Returns:
Column

A column containing the row indices from the left table after the join.

semi_join(self, Table left, stream=None, DeviceMemoryResource mr=None)

Returns a column of row indices corresponding to a semi-join between the right (filter) table and left table.

For details, see cudf::filtered_join::semi_join().

Parameters:
left : Table

The left table.

stream : Stream, optional

CUDA stream used for device memory operations and kernel launches.

mr : DeviceMemoryResource, optional

Device memory resource used to allocate the returned column’s device memory.

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.conditional_full_join(Table left, Table right, Expression binary_predicate, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a conditional full join between two tables.

For details, see conditional_full_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.conditional_inner_join(Table left, Table right, Expression binary_predicate, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a conditional inner join between two tables.

For details, see conditional_inner_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
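What a conditional inner join returns can be illustrated with a nested-loop model in plain Python. This is a sketch of the semantics only: `conditional_inner_join_model` is a hypothetical helper, and a Python callable stands in for the `Expression` predicate that the real function takes.

```python
# Hypothetical reference model; the real function evaluates an Expression on the GPU.
def conditional_inner_join_model(left_rows, right_rows, predicate):
    """Return (left_indices, right_indices) for every pair satisfying predicate."""
    left_idx, right_idx = [], []
    for i, left_row in enumerate(left_rows):
        for j, right_row in enumerate(right_rows):
            if predicate(left_row, right_row):
                left_idx.append(i)
                right_idx.append(j)
    return left_idx, right_idx


left = [(1,), (5,), (9,)]
right = [(4,), (6,)]

# Join condition: left column 0 is strictly less than right column 0.
left_idx, right_idx = conditional_inner_join_model(
    left, right, lambda l, r: l[0] < r[0]
)
```

Position `k` of the two index lists describes one matching pair: left row `left_idx[k]` joined with right row `right_idx[k]`. The pair ordering produced by this model is not something to rely on in pylibcudf.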

pylibcudf.join.conditional_left_anti_join(Table left, Table right, Expression binary_predicate, stream=None, DeviceMemoryResource mr=None) → Column

Perform a conditional left anti join between two tables.

For details, see conditional_left_anti_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.conditional_left_join(Table left, Table right, Expression binary_predicate, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a conditional left join between two tables.

For details, see conditional_left_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.conditional_left_semi_join(Table left, Table right, Expression binary_predicate, stream=None, DeviceMemoryResource mr=None) → Column

Perform a conditional left semi join between two tables.

For details, see conditional_left_semi_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Column

A column containing the row indices from the left table after the join.
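The conditional left semi and left anti joins each return a single column of left-row indices. A plain-Python model of the two, with hypothetical helper names and a callable standing in for the `Expression` predicate, might look like this:

```python
# Hypothetical reference models of the semantics, not the pylibcudf implementation.
def conditional_left_semi_join_model(left_rows, right_rows, predicate):
    """Indices of left rows for which at least one right row satisfies predicate."""
    return [
        i
        for i, left_row in enumerate(left_rows)
        if any(predicate(left_row, right_row) for right_row in right_rows)
    ]


def conditional_left_anti_join_model(left_rows, right_rows, predicate):
    """Indices of left rows for which no right row satisfies predicate."""
    return [
        i
        for i, left_row in enumerate(left_rows)
        if not any(predicate(left_row, right_row) for right_row in right_rows)
    ]


left = [(1,), (5,), (9,)]
right = [(4,), (6,)]


def less_than(l, r):
    # Join condition: left column 0 is strictly less than right column 0.
    return l[0] < r[0]


semi = conditional_left_semi_join_model(left, right, less_than)
anti = conditional_left_anti_join_model(left, right, less_than)
```

In this model every left row lands in exactly one of the two results, and a left row appears at most once even if several right rows satisfy the predicate.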

pylibcudf.join.cross_join(Table left, Table right, stream=None, DeviceMemoryResource mr=None) → Table

Perform a cross join on two tables.

For details, see cross_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

stream : Stream | None

CUDA stream on which to perform the operation.

mr : DeviceMemoryResource | None

Device memory resource used to allocate the returned table’s device memory.

Returns:
Table

The result of cross joining the two inputs.
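Unlike the index-returning joins on this page, a cross join returns a table. A minimal plain-Python sketch of the result, assuming each output row is a left row followed by a right row (the helper name and the row ordering are assumptions of the sketch):

```python
from itertools import product


# Hypothetical reference model; row order here is not a statement about pylibcudf.
def cross_join_model(left_rows, right_rows):
    """Pair every left row with every right row."""
    return [left_row + right_row for left_row, right_row in product(left_rows, right_rows)]


left = [("a",), ("b",)]
right = [(1,), (2,), (3,)]

result = cross_join_model(left, right)
```

The result has `len(left) * len(right)` rows, so output size grows quickly with the input sizes.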

pylibcudf.join.full_join(Table left_keys, Table right_keys, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a full join between two tables.

For details, see full_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
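A full join keeps unmatched rows from both sides. The plain-Python model below sketches that behaviour. It uses `None` to mark the missing side of an unmatched row, which is a choice made for this sketch; how pylibcudf encodes a missing match in the returned columns is not stated here, so consult the cudf documentation for that detail. Null keys are ignored in the model.

```python
# Hypothetical reference model; None marks "no matching row" in this sketch only.
def full_join_model(left_keys, right_keys):
    """Return (left_indices, right_indices), keeping unmatched rows from both sides."""
    left_idx, right_idx = [], []
    matched_right = set()
    for i, left_key in enumerate(left_keys):
        hits = [j for j, right_key in enumerate(right_keys) if left_key == right_key]
        if hits:
            for j in hits:
                left_idx.append(i)
                right_idx.append(j)
                matched_right.add(j)
        else:
            # Left row with no partner on the right.
            left_idx.append(i)
            right_idx.append(None)
    for j in range(len(right_keys)):
        if j not in matched_right:
            # Right row with no partner on the left.
            left_idx.append(None)
            right_idx.append(j)
    return left_idx, right_idx


left_idx, right_idx = full_join_model([(1,), (2,)], [(2,), (3,)])
```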

pylibcudf.join.inner_join(Table left_keys, Table right_keys, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform an inner join between two tables.

For details, see inner_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
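The effect of `nulls_equal` can be illustrated with a small plain-Python model in which `None` plays the role of a null key. The helper names are hypothetical, and the boolean flag stands in for the `NullEquality` value passed to the real function.

```python
# Hypothetical reference model of null-aware key comparison in an equality join.
def keys_match(left_key, right_key, nulls_equal):
    """Compare two key rows column by column, treating None as null."""
    for a, b in zip(left_key, right_key):
        if a is None or b is None:
            # A null only matches another null, and only when nulls_equal is set.
            if not (nulls_equal and a is None and b is None):
                return False
        elif a != b:
            return False
    return True


def inner_join_model(left_keys, right_keys, nulls_equal):
    """Return (left_indices, right_indices) for all matching key pairs."""
    left_idx, right_idx = [], []
    for i, left_key in enumerate(left_keys):
        for j, right_key in enumerate(right_keys):
            if keys_match(left_key, right_key, nulls_equal):
                left_idx.append(i)
                right_idx.append(j)
    return left_idx, right_idx


left_keys = [(1,), (None,)]
right_keys = [(None,), (1,)]

with_nulls_equal = inner_join_model(left_keys, right_keys, nulls_equal=True)
with_nulls_unequal = inner_join_model(left_keys, right_keys, nulls_equal=False)
```

With nulls comparing equal, the null key on the left pairs with the null key on the right; otherwise only the non-null match survives.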

pylibcudf.join.left_anti_join(Table left_keys, Table right_keys, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → Column

Perform a left anti join between two tables.

For details, see cudf::filtered_join.

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.left_join(Table left_keys, Table right_keys, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a left join between two tables.

For details, see left_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
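Because the join functions return row indices rather than a joined table, a separate gather step is needed to materialize the result. The plain-Python sketch below models both steps for a left join. The helper names are hypothetical, and `None` marks an unmatched right row in this model only; in pylibcudf the gather step would be performed with a table gather operation on the returned index columns.

```python
# Hypothetical reference model: compute left-join indices, then gather rows by index.
def left_join_model(left_keys, right_keys):
    """Return (left_indices, right_indices); unmatched left rows pair with None."""
    left_idx, right_idx = [], []
    for i, left_key in enumerate(left_keys):
        hits = [j for j, right_key in enumerate(right_keys) if left_key == right_key]
        for j in hits or [None]:
            left_idx.append(i)
            right_idx.append(j)
    return left_idx, right_idx


def gather_rows(rows, indices, width):
    """Pick rows by index; a None index yields an all-null row of the given width."""
    return [rows[i] if i is not None else (None,) * width for i in indices]


left_rows = [(1, "a"), (2, "b"), (3, "c")]
right_rows = [(1, "x"), (1, "y"), (3, "z")]

# Join on the first column of each table.
left_idx, right_idx = left_join_model(
    [row[:1] for row in left_rows], [row[:1] for row in right_rows]
)

joined = [
    left_row + right_row
    for left_row, right_row in zip(
        gather_rows(left_rows, left_idx, 2), gather_rows(right_rows, right_idx, 2)
    )
]
```

Separating index computation from gathering lets the caller join on key columns only and then gather whichever payload columns are actually needed.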

pylibcudf.join.left_semi_join(Table left_keys, Table right_keys, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → Column

Perform a left semi join between two tables.

For details, see cudf::filtered_join.

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.mixed_full_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a mixed full join between two tables.

For details, see mixed_full_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.mixed_inner_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a mixed inner join between two tables.

For details, see mixed_inner_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
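A mixed join combines an equality join on the key tables with a predicate on the conditional tables. The plain-Python model below sketches that combination for the inner case. The helper name is hypothetical, a callable stands in for the `Expression` predicate, and null keys are ignored. Row `i` of the key table and row `i` of the conditional table are assumed to describe the same logical row.

```python
# Hypothetical reference model: a pair matches only if keys are equal AND predicate holds.
def mixed_inner_join_model(left_keys, right_keys, left_cond, right_cond, predicate):
    """Return (left_indices, right_indices) for pairs passing both tests."""
    left_idx, right_idx = [], []
    for i in range(len(left_keys)):
        for j in range(len(right_keys)):
            keys_equal = left_keys[i] == right_keys[j]
            if keys_equal and predicate(left_cond[i], right_cond[j]):
                left_idx.append(i)
                right_idx.append(j)
    return left_idx, right_idx


left_keys = [(1,), (1,), (2,)]
left_cond = [(10,), (30,), (10,)]
right_keys = [(1,), (2,)]
right_cond = [(20,), (5,)]

# Condition on the conditional tables: left value strictly less than right value.
left_idx, right_idx = mixed_inner_join_model(
    left_keys, right_keys, left_cond, right_cond, lambda l, r: l[0] < r[0]
)
```

Only left row 0 pairs with right row 0 here: left row 1 has an equal key but fails the predicate, and left row 2 matches right row 1 on the key but also fails the predicate.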

pylibcudf.join.mixed_left_anti_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → Column

Perform a mixed left anti join between two tables.

For details, see mixed_left_anti_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.mixed_left_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → tuple

Perform a mixed left join between two tables.

For details, see mixed_left_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.mixed_left_semi_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, stream=None, DeviceMemoryResource mr=None) → Column

Perform a mixed left semi join between two tables.

For details, see mixed_left_semi_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Column

A column containing the row indices from the left table after the join.
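The mixed left semi and left anti joins return one column of left-row indices, selected by whether any right row passes both the key equality and the predicate. A plain-Python sketch, with hypothetical helper names, a callable standing in for the `Expression` predicate, and null keys ignored:

```python
# Hypothetical reference models of the semantics, not the pylibcudf implementation.
def _has_partner(i, left_keys, right_keys, left_cond, right_cond, predicate):
    """True if some right row has an equal key and satisfies the predicate."""
    return any(
        left_keys[i] == right_keys[j] and predicate(left_cond[i], right_cond[j])
        for j in range(len(right_keys))
    )


def mixed_left_semi_join_model(left_keys, right_keys, left_cond, right_cond, predicate):
    """Indices of left rows that have at least one partner on the right."""
    return [
        i
        for i in range(len(left_keys))
        if _has_partner(i, left_keys, right_keys, left_cond, right_cond, predicate)
    ]


def mixed_left_anti_join_model(left_keys, right_keys, left_cond, right_cond, predicate):
    """Indices of left rows that have no partner on the right."""
    return [
        i
        for i in range(len(left_keys))
        if not _has_partner(i, left_keys, right_keys, left_cond, right_cond, predicate)
    ]


left_keys = [(1,), (1,), (2,)]
left_cond = [(10,), (30,), (10,)]
right_keys = [(1,), (2,)]
right_cond = [(20,), (5,)]


def less_than(l, r):
    # Condition on the conditional tables: left value strictly less than right value.
    return l[0] < r[0]


semi = mixed_left_semi_join_model(left_keys, right_keys, left_cond, right_cond, less_than)
anti = mixed_left_anti_join_model(left_keys, right_keys, left_cond, right_cond, less_than)
```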