JSONExtract

Short Description

JSONExtract reads data from JSON files using SAX technology. It can also read data from compressed files, input port, and dictionary.

COMPONENTDATA SOURCEINPUT PORTSOUTPUT PORTSEACH TO ALL INPUTSDIFFERENT TO DIFFERENT OUTPUTSTRANSFORMATIONTRANSF. REQ.JAVACTLAUTO-PROPAGATED METADATA
JSONExtractJSON file0-11-nxxxxxx

Ports

PORT TYPENUMBERREQUIREDDESCRIPTIONMETADATA
Input0xFor port reading. See Reading from Input Port in Supported File URL Formats for Readers.One field (byte, cbyte, string).
Output0For correct data recordsAny
1-n[blFor correct data recordsAny

[1] Other output ports are required if mapping requires that.

Metadata

JSONExtract does not propagate metadata.
JSONExtract has no metadata template.
Metadata on optional input port must contain string or byte or cbyte field.
Metadata on each output port does not need to be the same.
Each metadata can use Autofilling Functions.

JSONExtract Attributes

ATTRIBUTEREQ.DESCRIPTIONPOSSIBLE VALUES
Basic
File URLYesAttribute specifying what data source(s) will be read (JSON file, input port, dictionary). See Supported File URL Formats for Readers and Notes and Limitations below.
CharsetEncoding of records which are read.any encoding, default system one by default
Mapping[blMapping of the input JSON structure to output ports. For more information, see XMLExtract Mapping Definition.
Mapping URL[blName of an external file, including its path which defines mapping of the input JSON structure to output ports. For more information, see XMLExtract Mapping Definition.
Equivalent XML SchemaURL of a file that should be used for creating the Mapping definition. For more information, see JSONExtract Mapping Editor and XSD Schema.
Use nested nodesBy default, nested elements are also mapped to output ports automatically. If set to false, an explicit tag must be created for each such nested element. See Use nested nodes examples.true (default) | false
Trim stringsBy default, white spaces from the beginning and the end of the elements values are removed. If set to false, they are not removed.true (default) | false
Advanced
Number of skipped mappingsNumber of mappings to be skipped continuously throughout all source files. See Selecting Input Records.0-N
Max number of rows to outputMaximum number of records to be read continuously throughout all source files. See Selecting Input Records.0-N

[1] One of these must be specified. If both are specified, Mapping URL has higher priority.

Details

JSONExtract reads data from JSON files using SAX technology. This component is faster than JSONReader which can read JSON files too. JSONExtract does not use DOM, so it uses less memory than JSONReader.
JSONExtract can read lists.
JSONExtract can convert JSON to variant. Result variant can contain field/array values of following data types - null, string, boolean, long and number.


JSONExtract is very similar to XMLExtract. JSONExtract internally transforms JSON to XML and uses XMLExtract to parsing the data. Therefore, you can generate xsd file for corresponding xml file.

Mapping in JSONExtract is almost same as in XMLExtract. The main difference is, that JSON does not have attributes. For more information, see XMLExtract’s Details.

JSONExtract Mapping Editor and XSD Schema

JSONExtract Mapping Editor serves to set up mapping from JSON tree structure to one ore more output ports without the necessity of being aware how to create mapping of field using an XML editor.

To be able to use the editor, the editor needs to have created equivalent xsd schema. The equivalent xsd schema is created automatically. Only the directory for the schema needs to be specified.

Any other operations to set up mapping are described in above mentioned XMLExtract.

Mapping Input Fields to the Output Fields

In JSONExtract, you can map input fields to the output in the same way as you map JSON fields. The input field mapping works in all three processing modes.

Examples

Use nested nodes examples

It is important to bear in mind that when the Use nested nodes attribute is set to True, one should be careful if there are elements with the same name because the JSONExtract component might return a different value than expected. See the following examples for further explanation:

In this sample JSON file there are two elements called id: the first one is a nested element within the groups element, and the other one is nested within the main result element. The value of the first id is groupID, and the value of the other id is resultID.

{
    "root": {
      "result": {
        "groups": {
          "id": "groupID"
        },
        "isEnabled": true,
        "isMetric": true,
        "id": "resultID",
        "language": "en",
        "firstDayOfWeek": "Sunday",
        "name": "North America"
      }
    }
  }

Example 1: The mapping is at the level of the main result element, and the Automap elements or attributes to fields with same name option is turned on, or the id (resultID) is specifically mapped.

The value of the id element will differ based on if the Use nested nodes value is set to True or False:

  • When the Use nested nodes value is set to True, the returned record is groupID. This is because it is the first id element that is found when parsing the data.
  • When the Use nested nodes value is set to False, the returned record is resultID. In this case, the groups id element is ignored, and the first found id element is the one within the result element.

Example 2: The mapping is at the id element nested within the result element.

The returned values of the id element will again differ based on if the Use nested nodes value is set to True or False:

  • When the Use nested nodes value is set to True, there are two returned records: groupID and resultID.
  • When the Use nested nodes value is set to False, only the resultID record is returned.

Reading lists

JSON file contains information about employees and orders. Each item contains employee ID and list of order IDs.

{
  "jsonextract_order" : {
    "employee" : "Henri",
    "orders" : [ "order01", "order08", "order15" ]
  },
  "jsonextract_order" : {
    "employee" : "Jane",
    "orders" : [ "order02", "order05", "order09" ]
  }
}

Read data for further processing.

Solution
Use the File URL attribute to point to the source file and the Mapping attribute to define mapping.

Reading variants

JSON file contains information about employees and orders. Each item contains employee ID and list of order IDs.

{
  "jsonextract_order" : {
    "employee" : "Henri",
    "orders" : [ "order01", "order08", "order15" ]
  },
  "jsonextract_order" : {
    "employee" : "Jane",
    "orders" : [ "order02", "order05", "order09" ]
  }
}

Read data for further processing.

Solution
Use the File URL attribute to point to the source file and the Mapping attribute to define mapping.

Content of mapped output variant field depends on structure of input JSON.

// for the first input element
$out.0.employeeData["jsonextract_order"]["employee"]; // contains 'Henri'
$out.0.employeeData["jsonextract_order"]["orders"][0]; // contains 'order01'

Best Practices

We recommend users to explicitly specify Charset.

See also

JSONReader
JSONWriter
Common Properties of Components
Specific Attribute Types
Common Properties of Readers
Database Connections