Stream Lookup
Stream Lookup
Description
The Stream Lookup transform type allows you to look up data using information coming from other transforms in the pipeline.
The data coming from the Source transform is first read into memory and is then used to look up data from the main stream.
Tip
Since this transform loads the lookup data into memory, it can be an extremely fast way to look up data. However, the entire lookup data set needs to fit in your available memory.
Hop Engine
Spark
Flink
Dataflow
Options
Transform name
Name of the transform this name has to be unique in a single pipeline
Lookup Transform
The Transform name where the lookup data is coming from
The keys to lookup…
Allows you to specify the names of the fields that are used to look up values. Values are always searched using the "equal" comparison
Fields to retrieve
You can specify the names of the fields to retrieve here, as well as the default value in case the value was not found or a new field name in case you didn’t like the old one.
Please note that since version 2.15.0 this transform is capable of resolving variable expressions for the default value. This means that in the unlikely case you used variable expressions in a previous version, this transform will now try to resolve them.
Preserve memory
Encodes rows of data to preserve memory while sorting. (Technical background: Hop will store the lookup data as raw bytes in a custom storage object that uses a hashcode of the bytes as the key. More CPU cost related to calculating the hashcode, less memory needed.)
Key and value are exactly one integer field
Preserves memory while executing a sort by . Note: Works only when "Preserve memory" is checked. Cannot be combined with the "Use sorted list" option.
Use sorted list
Enable to store values using a sorted list; this provides better memory usage when working with data sets containing wide row. Note: Works only when "Preserve memory" is checked. Cannot be combined with the "Key and value are exactly one integer field" option.
Get fields
Automatically fills in the names of all the available fields on the source side (A); you can then delete all the fields you don’t want to use for lookup.
Get lookup fields
Automatically inserts the names of all the available fields on the lookup side (B). You can then delete the fields you don’t want to retrieve
For guidance on preventing deadlocks when using the Stream Lookup transform, refer to this how-to guide: Avoiding deadlocks
Last updated 2025-09-04 18:23:13 +0200
Last updated