Common properties of components
Some properties are common to all components, or at least to most of them:
- You can choose which components are displayed in the Palette of Components and which are removed from it (Components in Palette).
- Each component can be set up using the Edit Component dialog (Edit Component Dialog).
  Among the properties that can be set in this dialog, the following are described in more detail:
  - Each component has a label with the component name (Component Name).
  - Each graph can be processed in phases (Phases).
  - Components can be disabled (Enable/Disable Component).
  - You can specify on which Cluster nodes a component will be executed (Component Allocation).
Component Name
Each component has a label that can be changed. Since a graph can contain multiple components, each with a specific function, you can name them accordingly for easier reference.
You can rename any component in one of the following four ways:
- In the Edit Component dialog, by specifying the Component name attribute.
- In the Properties tab, by specifying the Component name attribute.
- By highlighting and clicking it.
  If you highlight a component (by clicking the component itself or its item in the Outline pane), a hint appears showing the name of the component. After you click the highlighted component, a rectangle appears below it, showing the Component name on a blue background. Change the name shown in this rectangle and press Enter.
- By right-clicking the component and selecting Rename from the context menu. The same rectangle as described above appears below the component, and you can rename the component in the same way.
Phases
Each graph can be divided into several phases by setting the phase numbers on components. You can see this phase number in the upper left corner of every component.
Components with the same phase number run in parallel; that is, all components and edges that share a phase number run simultaneously. The next phase starts only after all processes in the current phase have terminated successfully. If processing stops within a phase, higher phases do not start.
That is why phase numbers must stay the same or increase in the direction of data flow. They cannot descend.
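The execution order described above can be illustrated with a minimal Python sketch. It is only a model of the rule, not the product's scheduler: the function `run_in_phases` and the tuple layout `(name, phase, task)` are hypothetical names chosen for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_in_phases(components):
    """Run components phase by phase.

    components: list of (name, phase, task), where task is a callable
    that returns True on success (hypothetical layout for this sketch).
    Returns the list of phase numbers that completed successfully.
    """
    by_phase = defaultdict(list)
    for name, phase, task in components:
        by_phase[phase].append(task)

    completed = []
    for phase in sorted(by_phase):
        # All tasks that share a phase number run concurrently.
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(lambda task: task(), by_phase[phase]))
        if not all(results):
            # A failure stops the graph; higher phases never start.
            break
        completed.append(phase)
    return completed
```

In this model, a failing component in phase 1 means that phase 0 is reported as completed and phase 2 is never started.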
So, when you increase the phase number of a component, all components further along the graph that have the same phase number change to the new value automatically. Components with higher phase numbers are not affected.
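The following Python sketch models this automatic renumbering under simplifying assumptions. The function `raise_phase` and its dictionary-based graph representation are hypothetical; the sketch only covers increasing a phase number and does not handle a new value that exceeds a downstream component's existing higher phase.

```python
from collections import deque


def raise_phase(phases, edges, start, new_phase):
    """Return a copy of `phases` after raising the phase of `start`.

    phases: dict mapping component name -> phase number
    edges:  dict mapping component name -> list of downstream components
    Downstream components that had the same phase as `start` receive
    the new value; components with other phase numbers are left alone.
    """
    old_phase = phases[start]
    if new_phase <= old_phase:
        raise ValueError("this sketch only models increasing a phase number")

    updated = dict(phases)
    queue = deque([start])
    seen = {start}
    while queue:
        node = queue.popleft()
        if updated[node] == old_phase:
            updated[node] = new_phase
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return updated
```

For a chain reader, filter, sorter (all in phase 0) followed by a writer in phase 1, raising the filter to phase 1 also moves the sorter to phase 1, while the upstream reader stays in phase 0.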
You can also select multiple components and set their phase numbers at once. Either set the same phase number for all selected components, or choose the step by which the phase number of each individual component is incremented or decremented.
To do that, use the Phase setting wizard.
Component Allocation
This attribute is taken into account only in the Data Shaper Cluster environment.
The Allocation attribute is common to all components. It is used in Cluster graph processing to plan how many instances of a component will be executed and on which Cluster nodes. Allocation is our basic concept for parallelization of data processing and for data routing between Cluster nodes.
Allocation can be specified in three different ways:
- Based on the number of workers: the component is executed in the requested number of instances on Cluster nodes preferred by Data Shaper Cluster.
- Based on a reference to a partitioned sandbox: the component is executed on all Cluster nodes where the partitioned sandbox has a location.
  This allocation type is transparently used as the default for most data readers and data writers that refer to a file in a partitioned sandbox.
- Defined by a list of Cluster node identifiers (a Cluster node can be listed more than once).
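As a rough illustration of the three allocation types, the Python sketch below resolves each one to a list of nodes, with one entry per component instance. The class names, the `resolve` function, and the round-robin choice among preferred nodes are all assumptions made for this example; they do not show the product's attribute syntax or its actual node-selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ByWorkers:
    count: int          # requested number of instances


@dataclass(frozen=True)
class BySandbox:
    sandbox: str        # name of a partitioned sandbox


@dataclass(frozen=True)
class ByNodes:
    nodes: tuple        # explicit node identifiers, repeats allowed


def resolve(allocation, preferred_nodes, sandbox_locations):
    """Return the nodes on which instances would run (one entry each)."""
    if isinstance(allocation, ByWorkers):
        # Assumption: cycle through the nodes the Cluster prefers.
        return [preferred_nodes[i % len(preferred_nodes)]
                for i in range(allocation.count)]
    if isinstance(allocation, BySandbox):
        # One instance on every node where the sandbox has a location.
        return list(sandbox_locations[allocation.sandbox])
    if isinstance(allocation, ByNodes):
        # The list is used as given; a repeated node runs several instances.
        return list(allocation.nodes)
    raise TypeError(f"unknown allocation type: {allocation!r}")
```

In all three cases the length of the returned list corresponds to the number of instances in which the component would be executed.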
Allocation is automatically inherited from neighboring components. Therefore, a continuous graph may have only a single component with an explicit allocation, and this allocation is used by all other components as well. All components of Clustered graphs are annotated with the number of instances (for example, x3) in which the component will finally be executed, the so-called allocation cardinality. These annotations are updated when the graph is saved. Allocation cardinality derived from neighbors is shown in a gray italic font; cardinality derived from an allocation defined directly on the component is shown in a solid font.
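The inheritance rule can be sketched in Python as a simple propagation over neighboring components. This is a simplified model: the function `inherit_cardinality` and the labels `"explicit"` and `"inherited"` are hypothetical, and the sketch ignores components that change the level of parallelism.

```python
from collections import deque


def inherit_cardinality(edges, explicit):
    """Spread allocation cardinality to neighboring components.

    edges:    list of (a, b) pairs of interconnected components
    explicit: dict mapping component name -> cardinality set on it
    Returns a dict mapping component name -> (cardinality, origin),
    where origin is 'explicit' or 'inherited'.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    result = {name: (n, "explicit") for name, n in explicit.items()}
    queue = deque(explicit)
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in result:
                # A component without its own allocation takes its neighbor's.
                result[nxt] = (result[node][0], "inherited")
                queue.append(nxt)
    return result
```

Here the `"inherited"` entries correspond to the cardinalities shown in gray italics, and the `"explicit"` entries to those shown in a solid font.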
Two interconnected components must have compatible allocations: the number of specified workers has to be equal. The only exceptions to this rule are Cluster components, which are dedicated to changing the level of parallelism. Parallel Partitioners change a single-worker allocation to a multi-worker allocation. Conversely, Parallel Gatherers change a multi-worker allocation to a single-worker allocation.
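One way to picture the compatibility rule is to give every component an input and an output worker count and to compare them across each edge. The Python sketch below does this; the `Component` class, the helper constructors, and `incompatible_edges` are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    workers_in: int     # worker count expected on the input side
    workers_out: int    # worker count produced on the output side


def regular(name, workers):
    # An ordinary component keeps the same level of parallelism.
    return Component(name, workers, workers)


def partitioner(name, workers):
    # Models a Parallel Partitioner: single-worker in, multi-worker out.
    return Component(name, 1, workers)


def gatherer(name, workers):
    # Models a Parallel Gatherer: multi-worker in, single-worker out.
    return Component(name, workers, 1)


def incompatible_edges(edges):
    """Return (source, target) names for edges whose worker counts differ."""
    return [(src.name, dst.name)
            for src, dst in edges
            if src.workers_out != dst.workers_in]
```

In this model, a single-worker reader connected directly to a three-worker sorter is reported as incompatible, while the same pair connected through a partitioner is accepted.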
For more details about Clustered graph processing, see Data Partitioning in Cluster.