Merge Join

Merge Join

Description

The Merge Join transform performs a classic merge join between data sets with data coming from two different input transforms.

This transform assumes your data is sorted on the join keys. Use Sort Rows transforms on the incoming streams to enforce sorting if necessary.

Join options include INNER, LEFT OUTER, RIGHT OUTER, and FULL OUTER.

Hop Engine

Spark

Flink

Dataflow

Options

Option
Description

First Transform

The first transform to read data from (left hand side of the join)

Second Transform

The second transform to read data from (right hand side of the join)

Join type

The join type that should be used; INNER, LEFT OUTER, RIGHT OUTER, and FULL OUTER

Key Field

The fields used for the join key, this only supports equal joins (key first transform = key second transform)

For guidance on preventing deadlocks when using the Merge Join transform, refer to this how-to guide: Avoiding deadlocks.

Last updated