join

class pylibcudf.join.FilteredJoin

Filtered hash join that builds a hash table from the right (filter) table on creation and probes it in subsequent join member function calls.

The build table is always treated as the right (filter) table. It can be probed with multiple left (probe) tables in subsequent semi_join or anti_join calls. For use cases where the left table should be reused with multiple right tables, use MarkJoin instead.

For details, see cudf::filtered_join.
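The build-once, probe-many pattern described above can be modeled in plain Python. This is only a sketch of the semantics, not pylibcudf code: the class name `FilteredJoinModel`, the use of lists of tuples as tables, and the ascending order of the returned indices are assumptions made for illustration.

```python
class FilteredJoinModel:
    """Pure-Python model of a filtered join (hypothetical, not the pylibcudf API)."""

    def __init__(self, build_rows):
        # Build the lookup structure once from the right (filter) table.
        self._keys = set(build_rows)

    def semi_join(self, probe_rows):
        # Indices of probe rows that have a match in the build table.
        return [i for i, row in enumerate(probe_rows) if row in self._keys]

    def anti_join(self, probe_rows):
        # Indices of probe rows that have no match in the build table.
        return [i for i, row in enumerate(probe_rows) if row not in self._keys]


build = [(1,), (3,)]
joiner = FilteredJoinModel(build)

# The same build table is reused for several probe tables.
probe_a = [(1,), (2,), (3,)]
probe_b = [(4,), (3,)]
semi_a = joiner.semi_join(probe_a)
anti_a = joiner.anti_join(probe_a)
semi_b = joiner.semi_join(probe_b)
```

The point of the design is that the cost of building the lookup structure is paid once, while each probe only performs lookups.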

Methods

anti_join(self, Table probe, ...)

Returns a column of row indices corresponding to an anti-join between the build table and probe table.

semi_join(self, Table probe, ...)

Returns a column of row indices corresponding to a semi-join between the build table and probe table.

anti_join(self, Table probe, Stream stream=None, DeviceMemoryResource mr=None)

Returns a column of row indices corresponding to an anti-join between the build table and probe table.

For details, see cudf::filtered_join::anti_join().

Parameters:
probe : Table

The probe table.

stream : Stream, optional

CUDA stream used for device memory operations and kernel launches.

mr : DeviceMemoryResource, optional

Device memory resource used to allocate the returned column’s device memory.

Returns:
Column

A column containing the row indices from the left (probe) table after the join.

semi_join(self, Table probe, Stream stream=None, DeviceMemoryResource mr=None)

Returns a column of row indices corresponding to a semi-join between the build table and probe table.

For details, see cudf::filtered_join::semi_join().

Parameters:
probe : Table

The probe table.

stream : Stream, optional

CUDA stream used for device memory operations and kernel launches.

mr : DeviceMemoryResource, optional

Device memory resource used to allocate the returned column’s device memory.

Returns:
Column

A column containing the row indices from the left (probe) table after the join.

pylibcudf.join.SetAsBuildTable

See also set_as_build_table.

Enum members

  • LEFT

  • RIGHT

pylibcudf.join.conditional_full_join(Table left, Table right, Expression binary_predicate, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a conditional full join between two tables.

For details, see conditional_full_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.conditional_inner_join(Table left, Table right, Expression binary_predicate, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a conditional inner join between two tables.

For details, see conditional_inner_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
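As a rough illustration of what the two returned index columns mean, the conditional inner join can be modeled as a nested loop over row pairs. This is a pure-Python sketch, not pylibcudf code: the function name is hypothetical, tables are lists of tuples, and a Python callable stands in for the Expression predicate. The order in which matching pairs appear is a property of this sketch only.

```python
def conditional_inner_join_model(left_rows, right_rows, predicate):
    """Return (left_indices, right_indices) for every row pair satisfying predicate."""
    left_idx, right_idx = [], []
    for i, lrow in enumerate(left_rows):
        for j, rrow in enumerate(right_rows):
            if predicate(lrow, rrow):
                # Position k of both lists together describes one matching pair.
                left_idx.append(i)
                right_idx.append(j)
    return left_idx, right_idx


left = [(1,), (5,), (9,)]
right = [(4,), (8,)]

# Hypothetical predicate: left column 0 is greater than right column 0.
li, ri = conditional_inner_join_model(left, right, lambda l, r: l[0] > r[0])
```

Here `li` and `ri` have the same length, and the pair `(li[k], ri[k])` identifies the k-th joined row.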

pylibcudf.join.conditional_left_anti_join(Table left, Table right, Expression binary_predicate, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a conditional left anti join between two tables.

For details, see conditional_left_anti_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.conditional_left_join(Table left, Table right, Expression binary_predicate, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a conditional left join between two tables.

For details, see conditional_left_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
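A left join differs from an inner join in that every left row appears in the output at least once. The pure-Python sketch below models that behavior under stated assumptions: the function name is hypothetical, a Python callable replaces the Expression predicate, and `None` marks a left row with no partner. How pylibcudf encodes unmatched rows in the returned index column is not stated on this page, so the `None` marker is purely illustrative.

```python
def conditional_left_join_model(left_rows, right_rows, predicate):
    """Model of a conditional left join returning (left_indices, right_indices)."""
    left_idx, right_idx = [], []
    for i, lrow in enumerate(left_rows):
        matches = [j for j, rrow in enumerate(right_rows) if predicate(lrow, rrow)]
        if matches:
            for j in matches:
                left_idx.append(i)
                right_idx.append(j)
        else:
            # Keep the unmatched left row; None is this sketch's "no match" marker.
            left_idx.append(i)
            right_idx.append(None)
    return left_idx, right_idx


left = [(1,), (5,)]
right = [(4,), (8,)]
li, ri = conditional_left_join_model(left, right, lambda l, r: l[0] > r[0])
```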

pylibcudf.join.conditional_left_semi_join(Table left, Table right, Expression binary_predicate, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a conditional left semi join between two tables.

For details, see conditional_left_semi_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

binary_predicate : Expression

Condition to join on.

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.cross_join(Table left, Table right, Stream stream=None, DeviceMemoryResource mr=None) -> Table

Perform a cross join on two tables.

For details, see cross_join().

Parameters:
left : Table

The left table to join.

right : Table

The right table to join.

stream : Stream | None

CUDA stream on which to perform the operation.

mr : DeviceMemoryResource | None

Device memory resource used to allocate the returned table’s device memory.

Returns:
Table

The result of cross joining the two inputs.
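Unlike the other joins on this page, cross_join returns a Table rather than index columns. A minimal pure-Python sketch of the Cartesian product it represents is shown below. The function name and the list-of-tuples table representation are assumptions for illustration, and the row order shown (each left row paired with all right rows in turn) is a property of this sketch rather than a documented guarantee.

```python
from itertools import product


def cross_join_model(left_rows, right_rows):
    """Pair every left row with every right row, concatenating their columns."""
    return [lrow + rrow for lrow, rrow in product(left_rows, right_rows)]


left = [(1, "a"), (2, "b")]
right = [(True,), (False,)]
result = cross_join_model(left, right)
```

The result has `len(left) * len(right)` rows, which is why cross joins grow quickly with input size.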

pylibcudf.join.full_join(Table left_keys, Table right_keys, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a full join between two tables.

For details, see full_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
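A full join keeps unmatched rows from both sides. The pure-Python sketch below models this for key tables without nulls. The function name is hypothetical, and `None` is used as an illustrative marker for the missing side of an unmatched row; this page does not state how pylibcudf represents that case, nor the order of the output rows.

```python
def full_join_model(left_keys, right_keys):
    """Model of an equality full join returning (left_indices, right_indices)."""
    left_idx, right_idx = [], []
    matched_right = set()
    for i, lkey in enumerate(left_keys):
        hits = [j for j, rkey in enumerate(right_keys) if lkey == rkey]
        if hits:
            for j in hits:
                left_idx.append(i)
                right_idx.append(j)
            matched_right.update(hits)
        else:
            # Left row with no partner on the right.
            left_idx.append(i)
            right_idx.append(None)
    for j in range(len(right_keys)):
        if j not in matched_right:
            # Right row with no partner on the left.
            left_idx.append(None)
            right_idx.append(j)
    return left_idx, right_idx


left_keys = [(1,), (2,)]
right_keys = [(2,), (3,)]
li, ri = full_join_model(left_keys, right_keys)
```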

pylibcudf.join.inner_join(Table left_keys, Table right_keys, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform an inner join between two tables.

For details, see inner_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
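The effect of nulls_equal is easiest to see on a small example. The following pure-Python sketch models an equality inner join with a hash table. It is not pylibcudf code: the function name is hypothetical, `None` stands in for a null key value, a boolean stands in for the NullEquality argument, and the output order is specific to this sketch.

```python
from collections import defaultdict


def inner_join_model(left_keys, right_keys, nulls_equal):
    """Model of an equality inner join returning (left_indices, right_indices)."""
    # Build phase: map each right key to the right row indices that hold it.
    table = defaultdict(list)
    for j, key in enumerate(right_keys):
        if nulls_equal or None not in key:
            table[key].append(j)

    # Probe phase: look up each left key.
    left_idx, right_idx = [], []
    for i, key in enumerate(left_keys):
        if not nulls_equal and None in key:
            continue  # A key containing a null never matches anything.
        for j in table.get(key, []):
            left_idx.append(i)
            right_idx.append(j)
    return left_idx, right_idx


left_keys = [(1,), (None,), (2,)]
right_keys = [(None,), (2,), (2,)]
eq = inner_join_model(left_keys, right_keys, nulls_equal=True)
ne = inner_join_model(left_keys, right_keys, nulls_equal=False)
```

With nulls treated as equal, the null key on the left pairs with the null key on the right. With nulls treated as unequal, that pair is dropped and only the rows keyed by 2 remain.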

pylibcudf.join.left_anti_join(Table left_keys, Table right_keys, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a left anti join between two tables.

For details, see cudf::filtered_join.

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.left_join(Table left_keys, Table right_keys, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a left join between two tables.

For details, see left_join().

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.left_semi_join(Table left_keys, Table right_keys, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a left semi join between two tables.

For details, see cudf::filtered_join.

Parameters:
left_keys : Table

The left table to join.

right_keys : Table

The right table to join.

nulls_equal : NullEquality

Should nulls compare equal?

Returns:
Column

A column containing the row indices from the left table after the join.
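Semi and anti joins return a single column of left row indices because each left row is either kept or dropped, never duplicated. The pure-Python sketch below illustrates this for keys without nulls. The function names are hypothetical, and the ascending index order is a property of the sketch only.

```python
def left_semi_join_model(left_keys, right_keys):
    """Indices of left rows whose key appears in the right table."""
    right = set(right_keys)
    return [i for i, key in enumerate(left_keys) if key in right]


def left_anti_join_model(left_keys, right_keys):
    """Indices of left rows whose key does not appear in the right table."""
    right = set(right_keys)
    return [i for i, key in enumerate(left_keys) if key not in right]


left_keys = [(1,), (2,), (2,), (3,)]
right_keys = [(2,), (2,), (4,)]  # The duplicate key does not duplicate output rows.
semi = left_semi_join_model(left_keys, right_keys)
anti = left_anti_join_model(left_keys, right_keys)
```

Taken together, the two results partition the left row indices: every left row lands in exactly one of them.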

pylibcudf.join.mixed_full_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a mixed full join between two tables.

For details, see mixed_full_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.mixed_inner_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a mixed inner join between two tables.

For details, see mixed_inner_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.
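A mixed join combines an equality test on the key tables with a predicate on the conditional tables. The pure-Python sketch below models the inner variant under stated assumptions: the function name is hypothetical, a Python callable replaces the Expression predicate, the keys contain no nulls, and row i of `left_keys` and row i of `left_conditional` are assumed to describe the same logical left row (likewise on the right).

```python
def mixed_inner_join_model(left_keys, right_keys, left_cond, right_cond, predicate):
    """Model of a mixed inner join returning (left_indices, right_indices)."""
    left_idx, right_idx = [], []
    for i in range(len(left_keys)):
        for j in range(len(right_keys)):
            keys_match = left_keys[i] == right_keys[j]
            # A pair is kept only if both the keys and the predicate agree.
            if keys_match and predicate(left_cond[i], right_cond[j]):
                left_idx.append(i)
                right_idx.append(j)
    return left_idx, right_idx


left_keys = [(1,), (1,), (2,)]
left_cond = [(10,), (30,), (35,)]
right_keys = [(1,), (2,)]
right_cond = [(20,), (40,)]

# Hypothetical predicate: left conditional value is less than the right one.
li, ri = mixed_inner_join_model(
    left_keys, right_keys, left_cond, right_cond, lambda l, r: l[0] < r[0]
)
```

Left row 1 shares a key with right row 0 but fails the predicate (30 is not less than 20), so it is excluded even though an equality-only join would keep it.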

pylibcudf.join.mixed_left_anti_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a mixed left anti join between two tables.

For details, see mixed_left_anti_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Column

A column containing the row indices from the left table after the join.

pylibcudf.join.mixed_left_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> tuple

Perform a mixed left join between two tables.

For details, see mixed_left_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Tuple[Column, Column]

A tuple containing the row indices from the left and right tables after the join.

pylibcudf.join.mixed_left_semi_join(Table left_keys, Table right_keys, Table left_conditional, Table right_conditional, Expression binary_predicate, null_equality nulls_equal, Stream stream=None, DeviceMemoryResource mr=None) -> Column

Perform a mixed left semi join between two tables.

For details, see mixed_left_semi_join().

Parameters:
left_keys : Table

The left table to use for the equality join.

right_keys : Table

The right table to use for the equality join.

left_conditional : Table

The left table to use for the conditional join.

right_conditional : Table

The right table to use for the conditional join.

binary_predicate : Expression

Condition to join on.

nulls_equal : NullEquality

Should nulls compare equal in the equality join?

Returns:
Column

A column containing the row indices from the left table after the join.