Lookup Tables in Cluster Environment
To understand how lookup tables work in a Cluster environment, you first need to understand how Clustered graphs are processed: each one is split into several separate graphs that are distributed among the Cluster nodes. Details are available in the Parallel Data Processing section. In short, a Clustered graph is executed as several instances, called worker graphs, according to a transformation plan. The transformation plan is the result of a transformation analysis, which takes into account component allocation, the use of partitioned sandboxes, and occurrences of Clustered components. The plan specifies how many instances of the graph will be executed and on which Cluster nodes. It also defines how each worker graph is modified for the Clustered run: which components actually run in a particular worker and which are removed.
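The idea of a transformation plan can be pictured with a small toy model. This is only a conceptual sketch in plain Python, not the Server's actual analysis or API: the Component class, the build_plan function, and the component and node names are all hypothetical, and the real analysis also considers partitioned sandboxes and Clustered components, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    # Hypothetical model: a component and the Cluster nodes it is allocated to.
    name: str
    nodes: frozenset


def build_plan(components):
    """Return {node: [names of components kept in that node's worker graph]}."""
    all_nodes = sorted(set().union(*(c.nodes for c in components)))
    return {
        node: [c.name for c in components if node in c.nodes]
        for node in all_nodes
    }


# Assumed example graph: two components run on one node only,
# two components run on both nodes.
components = [
    Component("READER", frozenset({"node1"})),
    Component("PARTITION", frozenset({"node1"})),
    Component("TRANSFORM", frozenset({"node1", "node2"})),
    Component("WRITER", frozenset({"node1", "node2"})),
]

plan = build_plan(components)
for node, kept in plan.items():
    print(node, kept)
```

In this model the plan has one worker graph per node, and any component not allocated to a node is simply absent from that node's worker graph.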
The Data Shaper Server Cluster environment does not provide any special support for lookup tables. Each Clustered graph instance creates its own set of lookup tables, and these lookup table instances do not cooperate with each other. For example, with SimpleLookupTable, each instance of a Clustered graph has its own SimpleLookupTable instance, which loads data from the specified data file separately. The data file is therefore read by each instance of the Clustered graph, and each instance keeps a separate set of cached records. DBLookupTable works seamlessly in a Cluster environment: the internal cache for database responses is managed by each worker graph separately.
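The behavior described above can be simulated in a few lines. The following is a conceptual sketch in plain Python, not product code: WorkerGraph, load_records, and the sample data are hypothetical stand-ins for a worker graph, the loading of a SimpleLookupTable, and its data file.

```python
import csv
import io

# Stand-in for the shared data file that every worker reads (assumed content).
SOURCE_TEXT = "id,name\n1,Alice\n2,Bob\n"

reads = 0  # counts how many times the "file" is loaded


def load_records(text):
    """Parse the data source into a dict; called separately by each worker."""
    global reads
    reads += 1
    return {row["id"]: row["name"] for row in csv.DictReader(io.StringIO(text))}


class WorkerGraph:
    """Hypothetical worker graph that owns a private lookup table."""

    def __init__(self, name):
        self.name = name
        self.lookup = load_records(SOURCE_TEXT)


workers = [WorkerGraph(f"worker{i}") for i in range(1, 4)]

print(reads)                                   # 3
print(workers[0].lookup is workers[1].lookup)  # False
print(workers[0].lookup == workers[1].lookup)  # True
```

Three workers cause three separate loads, and although the tables start with equal contents, they are distinct objects with no link between them.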
Take care when writing data records into a lookup table using the LookupTableReaderWriter component. It is important to consider which worker does the writing, because the lookup table update is performed only locally. Make sure the LookupTableReaderWriter component runs on every worker where the updated lookup table is needed.
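The effect of a local-only update can be illustrated with a similar toy model. Again this is a sketch under stated assumptions, not the LookupTableReaderWriter API: the Worker class and its write and get methods are hypothetical.

```python
class Worker:
    """Hypothetical worker holding its own private copy of a lookup table."""

    def __init__(self, name, initial):
        self.name = name
        self.lookup = dict(initial)  # private copy, not shared with other workers

    def write(self, key, value):
        # Models a lookup table write: only this worker's table changes.
        self.lookup[key] = value

    def get(self, key):
        return self.lookup.get(key)


initial = {"1": "Alice"}
w1 = Worker("worker1", initial)
w2 = Worker("worker2", initial)

# The write runs on worker1 only, so worker2 never sees the new record.
w1.write("2", "Bob")
print(w1.get("2"))  # Bob
print(w2.get("2"))  # None

# Running the write on every worker makes the record available everywhere.
for w in (w1, w2):
    w.write("3", "Carol")
print(w1.get("3"), w2.get("3"))  # Carol Carol
```

The second half mirrors the recommendation above: the writing step has to run on each worker that later needs the updated records.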