Primeur Online Docs
Data Shaper

Defining Transformations


Last updated 1 month ago

For basic information about transformations, see Transformations. For a brief overview of transformations, see Transformations Overview.

This section explains how to create transformations that change the data flowing through components. In particular, it covers:

  1. Which components can be used to apply transformations.

  2. Which languages can be used to write transformations.

  3. Whether the definition can be internal or external.

  4. What the return values of transformations are.

  5. What the Transform editor is and how to work with it.

  6. Which interfaces are common to most transformation-allowing components.

Components used in Transformation

Transformations can be defined in the following components:

  • DataGenerator, Map, and Rollup These components require a transformation. You can define the transformation in Java or in the Data Shaper transformation language. In these components, different data records can be sent out through different output ports using the return values of the transformation. To send different records to different output ports, you must both map the record to the corresponding output port and return the corresponding integer value.

  • Partition You can define the transformation in Java or in the Data Shaper transformation language. To send different records to different output ports or Cluster nodes, you must return the corresponding integer value, but no mapping needs to be written in this component, since all records are sent out automatically.

  • DataIntersection, Denormalizer, Normalizer, ExtHashJoin, ExtMergeJoin, LookupJoin, DBJoin and RelationalJoin These components require a transformation. You can define the transformation in Java or in the Data Shaper transformation language.

  • CustomJavaReader This component requires a transformation. You can write it only in Java.
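The "map the record and return the port number" rule for multi-port components can be illustrated with a minimal sketch. It is plain Java and does not use the Data Shaper Java interfaces: the class PortRoutingSketch, the methods transform() and route(), the "amount" field and the 1000 threshold are all assumptions made for illustration, and records are modelled as simple maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-port component such as Map.
// None of these names come from the Data Shaper API.
final class PortRoutingSketch {

    // Maps the input record to one output port and returns that port's number.
    static int transform(Map<String, Object> in, List<Map<String, Object>> outs) {
        int amount = (Integer) in.get("amount");   // "amount" is an illustrative field
        int port = amount >= 1000 ? 1 : 0;         // pick the output port
        outs.get(port).putAll(in);                 // the mapping for that port
        return port;                               // the matching integer return value
    }

    // Runs transform() on one record and returns what was mapped to each port
    // (an empty map means nothing was mapped to that port).
    static List<Map<String, Object>> route(int amount) {
        Map<String, Object> in = new HashMap<>();
        in.put("amount", amount);
        List<Map<String, Object>> outs = new ArrayList<>();
        outs.add(new HashMap<>());                 // output port 0
        outs.add(new HashMap<>());                 // output port 1
        transform(in, outs);
        return outs;
    }
}
```

In this sketch both steps happen inside transform(): the record is copied to the chosen port and the same port number is returned. In a component where records are sent out automatically, only the return value would be needed.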

Java or CTL

Transformations can be written in Java or in the Data Shaper transformation language (CTL):

  • Java can be used in all components, so a transformation can always be written in Java. Transformations executed in Java are faster than those written in CTL.

  • CTL is a very simple scripting language that can be used in most of the transforming components. CTL can be used even without any prior knowledge of Java.

Internal or External Definition

Each transformation can be defined as internal or external:

  • Internal transformation: An attribute like Transform, Denormalize, etc. must be defined. In such a case, the piece of code is written directly in the graph and can be seen in it.

  • External transformation: One of the following two kinds of attributes may be defined:

    • Transform URL, Denormalize URL, etc., for both Java and CTL. The code is written in an external file. The charset of the external file can also be specified (Transform source charset, Denormalize source charset, etc.). For transformations written in Java, the folder containing the transformation source code must be specified as a source for the Java compiler so that the transformation can be executed successfully.

    • Transform class, Denormalize class, etc. This is a compiled Java class. The class must be on the classpath so that the transformation can be executed successfully.

This is a brief overview:

More details about defining transformations can be found in the sections concerning corresponding components. Both transformation functions (required and optional) of CTL templates and Java interfaces are described there.

Find here below an overview of transformation-allowing components.

Readers

| Component | Transformation required | Java | CTL | Each to all outputs [1] | Different to different outputs [2] | CTL template | Java interface |
| DataGenerator | ✓ | ✓ | ✓ | x | ✓ | CTL Templates for DataGenerator | Java Interface |
| MultiLevelReader | ✓ | ✓ | x | x | ✓ | -- | Java Interfaces for MultiLevelReader |

Writers

Transformers

| Component | Transformation required | Java | CTL | Each to all outputs [1] | Different to different outputs [2] | CTL template | Java interface |
| DataIntersection | ✓ | ✓ | ✓ | -- | -- | CTL Templates for DataIntersection | Java Interfaces for DataIntersection |
| Map | ✓ | ✓ | ✓ | x | ✓ | CTL Templates for Map | Java Interfaces for Map |
| Denormalizer | ✓ | ✓ | ✓ | -- | -- | CTL Templates | Java Interface |
| Normalizer | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Normalizer | Java Interface |
| Rollup | ✓ | ✓ | ✓ | x | ✓ | CTL Templates for Rollup | Java Interface |
| DataSampler | ✓ | x | x | -- | -- | -- | -- |

Joiners

| Component | Transformation required | Java | CTL | Each to all outputs [1] | Different to different outputs [2] | CTL template | Java interface |
| ExtHashJoin | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Joiners | Java Interfaces for Joiners |
| ExtMergeJoin | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Joiners | Java Interfaces for Joiners |
| LookupJoin | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Joiners | Java Interfaces for Joiners |
| DBJoin | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Joiners | Java Interfaces for Joiners |
| RelationalJoin | ✓ | ✓ | ✓ | -- | -- | CTL Templates for Joiners | Java Interfaces for Joiners |

[1] If this is yes, each data record is always sent out through all connected output ports.

Return Values of Transformations

In components where transformations are defined, some return values may also be defined. These are integers greater than, equal to or less than 0.

Note: DBExecute can also return integer values less than 0 in the form of SQLExceptions.

  • Positive or zero return values

    • ALL = Integer.MAX_VALUE In this case, the record is sent out through all output ports. This variable does not need to be declared before it is used. In CTL, ALL equals 2147483647, in other words, Integer.MAX_VALUE. Both ALL and 2147483647 can be used.

    • OK = 0 In this case, the record is sent out through the single output port or through output port 0 (if the component has multiple output ports, e.g. Map, Rollup). This variable does not need to be declared before it is used.

    • Any other integer number greater than or equal to 0 In this case, the record is sent out through the output port whose number is equal to this return value. These values can be called Mapping codes.

  • Negative return values

    • SKIP = -1 This value indicates that an error has occurred, but the erroneous record is skipped and processing continues. This variable does not need to be declared before it is used. Both SKIP and -1 can be used.

    • STOP = -2 This value indicates that an error has occurred and processing must be stopped. This variable does not need to be declared before it is used. Both STOP and -2 can be used.

      Warning! The same return value is ERROR in CTL1. STOP can be used in CTL2.

    • Any integer number less than or equal to -1 These values must be defined by the user as described below. Their meaning is fatal error. These values can be called Error codes.
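The meaning of the return values listed above can be summarized in a small sketch. It is plain Java written for illustration only: the class ReturnCodes and the method describe() are hypothetical, and only the constant values (ALL, OK, SKIP, STOP) are taken from the text.

```java
// Hypothetical helper that classifies a transformation return value.
// Only the four constant values come from the documentation above.
final class ReturnCodes {
    static final int ALL = Integer.MAX_VALUE;  // 2147483647
    static final int OK = 0;
    static final int SKIP = -1;
    static final int STOP = -2;

    // Describes what happens to a record for a given return value.
    static String describe(int code, int portCount) {
        if (code == ALL) {
            return "send to all " + portCount + " ports";
        }
        if (code >= 0) {
            return "send to port " + code;     // OK (0) or another mapping code
        }
        if (code == SKIP) {
            return "skip record and continue";
        }
        return "stop processing (error code " + code + ")"; // STOP or a user-defined error code
    }
}
```

For example, describe(ReturnCodes.OK, 2) yields "send to port 0", and describe(-5, 2) yields "stop processing (error code -5)".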

  1. Values greater than or equal to 0 Return values greater than or equal to 0 send the same data record to the specified output ports only in DataGenerator, Partition, Map and Rollup. Do not forget to define the mapping for each connected output port in DataGenerator, Map, and Rollup. In Partition (and clusterpartition), mapping is performed automatically. In the other components, these values have no meaning: those components have either a single output port or output ports that are strictly defined for explicit outputs.

  2. Values less than -1 These return values do not call the corresponding optional OnError() function of the CTL template. To call an optional OnError() function, you can use, for example, the raiseError(string Arg) function. It throws an exception that calls the corresponding OnError() function, e.g. transformOnError(). Any other exception thrown by a transformation function also calls the corresponding OnError() function, if it is defined.

  3. Values less than or equal to -2 If any of the functions that return integer values returns a value less than or equal to -2 (including STOP), the getMessage() function is called (if defined). Therefore, to allow this function to be called, one or more return statements with values less than or equal to -2 must be added to the functions that return an integer. For example, if a function such as transform(), append() or count() returns -2, getMessage() is called and the message is written to the Console.

Warning! If the graph fails with an exception or because a value less than -1 is returned, no records are written to the output file. If you want previously processed records to be written to the output, return SKIP (-1) instead. The erroneous records are then skipped, the graph does not fail, and at least some records are written to the output.
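The interplay between exceptions, OnError() functions, getMessage() and the warning above can be sketched as a simplified model. It is plain Java under stated assumptions: the class ErrorHandlingSketch, the Transformation interface and the run() driver are hypothetical and do not reproduce the real engine or its interfaces. Only the function names transform(), transformOnError() and getMessage() and the values of OK, SKIP and STOP come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the error-handling rules described above.
final class ErrorHandlingSketch {
    static final int OK = 0, SKIP = -1, STOP = -2;

    interface Transformation {
        int transform(int value) throws Exception;

        // Called only when transform() throws, never for a negative return value.
        // Returning STOP here models "no OnError() defined, so the graph fails".
        default int transformOnError(Exception e, int value) { return STOP; }

        // Called when a function returns a value less than or equal to -2.
        default String getMessage() { return "no message"; }
    }

    static final class Result {
        final List<Integer> written;   // records that reach the output
        final String message;          // getMessage() text if the run failed, else null
        Result(List<Integer> written, String message) {
            this.written = written;
            this.message = message;
        }
    }

    static Result run(List<Integer> input, Transformation t) {
        List<Integer> buffered = new ArrayList<>();
        for (int value : input) {
            int code;
            try {
                code = t.transform(value);
            } catch (Exception e) {
                code = t.transformOnError(e, value);
            }
            if (code <= STOP) {
                // The graph fails: nothing is written, the message is reported.
                return new Result(new ArrayList<>(), t.getMessage());
            }
            if (code == SKIP) {
                continue;              // drop this record, keep processing
            }
            buffered.add(value);
        }
        return new Result(buffered, null);
    }
}
```

With the input 1, -2, 3 and a transformation that returns SKIP for negative values, this model writes 1 and 3. If the same transformation returns STOP instead, nothing is written and the getMessage() text is reported.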

Transform, Denormalize, etc. To define a transformation in the graph itself, use the Transform editor. A transformation defined this way is located in the graph and is visible in it. The transformation can be written in Java or CTL, as mentioned above. For more detailed information about the editor or the dialog, see Transform Editor or Edit value dialog.

Transform URL, Denormalize URL, etc. You can also use a transformation defined in a source file outside the graph. To locate the transformation source file, use the URL file dialog. Each of the mentioned components can use this transformation definition. The file must contain the definition of the transformation written in either Java or CTL. In this case, the transformation is located outside the graph. For more detailed information, see URL file dialog.

Transform class, Denormalize class, etc. In all transforming components, you can use a compiled transformation class. To do that, use the Open Type wizard. In this case, the transformation is located outside the graph. For more detailed information, see Open type dialog.


[2] If this is yes, each data record can be sent out through the connected output port whose number is returned by the transformation. For more information, see Return Values of Transformations.
