DataOneVFSReader
Short Description
DataOneVFSReader streams the content of an input file residing on the Data One VFS to the downstream Data Shaper graph, together with the file's metadata, the DataOneFileDescriptor (DOFD), which is documented at this link.
Files residing in the Data One Core virtual file system (VFS) are processed by Data Shaper graphs that contain a specialized reader component, DataOneVFSReader, and a specialized writer component, DataOneVFSWriter.
The resulting graph performs the required processing on the file by combining DataOneVFSReader and DataOneVFSWriter with any of the other components available in the Data Shaper Designer palette.
The graph itself is invoked from within a mediation contract by a specialized service task named Data Shaper Processor, which submits the graph and monitors its outcome.
Data Shaper developers can expose parameters at graph level, known as graph parameters. Two sets of values are injected into the graph at execution time. Data One Contract Parameters (DOP_*) are configured by the Data One user when the mediation contract is configured. Data One Context Attributes (DOX_*) are injected implicitly. Details about the exposed DOP_* graph parameters can be found here. Details about DOX_* attributes can be found here.
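The injection described above can be pictured with a small sketch. It is not Data Shaper code: the function names, the `${NAME}` resolution rule for unknown parameters, and the `DOX_EXAMPLE_ATTR` attribute are assumptions made for illustration only. Only the `DOP_`/`DOX_` prefixes and the `${DOP_FILESET_ID}` placeholder come from this page.

```python
import re
from typing import Dict

# Matches ${NAME} placeholders such as ${DOP_FILESET_ID}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_]+)\}")


def build_graph_parameters(contract_params: Dict[str, str],
                           context_attrs: Dict[str, str]) -> Dict[str, str]:
    """Merge contract parameters (DOP_*) and context attributes (DOX_*).

    Hypothetical helper: it only checks the prefixes and builds one map.
    """
    merged: Dict[str, str] = {}
    for name, value in contract_params.items():
        if not name.startswith("DOP_"):
            raise ValueError(f"not a contract parameter: {name}")
        merged[name] = value
    for name, value in context_attrs.items():
        if not name.startswith("DOX_"):
            raise ValueError(f"not a context attribute: {name}")
        merged[name] = value
    return merged


def resolve(text: str, params: Dict[str, str]) -> str:
    """Replace each ${NAME} with its value; unknown names are left as-is
    (an assumption of this sketch, not documented behavior)."""
    return PLACEHOLDER.sub(lambda m: params.get(m.group(1), m.group(0)), text)


# Example values are made up; DOX_EXAMPLE_ATTR is a hypothetical name.
params = build_graph_parameters(
    {"DOP_FILESET_ID": "fs-001"},
    {"DOX_EXAMPLE_ATTR": "demo"},
)
fileset_id = resolve("${DOP_FILESET_ID}", params)
```

In this sketch `fileset_id` ends up holding the value configured in the contract, which mirrors how a component attribute bound to a graph parameter receives its value at execution time.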
Data Shaper graphs based on DataOneVFSReader and DataOneVFSWriter are not constrained to process a single input file and produce a single output file. Both fan-out and fan-in data processing graphs can be easily implemented using the approach described above.
The following sections describe each of the components mentioned above in detail.
COMPONENT | DATA SOURCE | INPUT PORTS | OUTPUT PORTS | EACH TO ALL INPUTS | DIFFERENT TO DIFFERENT OUTPUTS | TRANSFORMATION | TRANSF. REQ. | JAVA | CTL | AUTO-PROPAGATED METADATA |
---|---|---|---|---|---|---|---|---|---|---|
DataOneVFSReader | Data One | 0 | 1-2 | x | ✓ | x | x | x | x | ✓ |
Ports
PORT TYPE | NUMBER | NAME | REQ. | DESCRIPTION | METADATA |
---|---|---|---|---|---|
Output | 0 | content | ✓ | File data content, streamed in chunks to the downstream graph | Metadata must be defined on the edge |
Output | 1 | metadata | | Data One File Descriptor metadata, propagated once per file right after opening the input Data One file whose content is streamed in chunks to the downstream graph | Data One File Descriptor. Read this documentation |
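As a rough illustration of the two output ports, the sketch below emits the descriptor once, right after the file is opened, and then emits the content chunk by chunk. It is a plain Python model, not the component's implementation: the `read_file` function, the chunk size, and the `example_field` descriptor key are hypothetical, and a real graph delivers the two ports on separate edges rather than in one interleaved sequence.

```python
import io
from typing import BinaryIO, Dict, Iterator, Tuple, Union

CONTENT_PORT = 0   # output port 0: file data content
METADATA_PORT = 1  # output port 1: file descriptor metadata

Record = Tuple[int, Union[bytes, Dict[str, str]]]


def read_file(stream: BinaryIO,
              descriptor: Dict[str, str],
              chunk_size: int = 4) -> Iterator[Record]:
    """Yield (port, payload) pairs for one input file."""
    # The descriptor is sent exactly once, before any content.
    yield METADATA_PORT, descriptor
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of file
            break
        yield CONTENT_PORT, chunk


# An in-memory stream stands in for a file opened on the Data One VFS.
records = list(read_file(io.BytesIO(b"abcdefghij"),
                         {"example_field": "example"}))
```

A downstream consumer of this model would see one metadata record followed by the content chunks in file order.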
DataOneVFSReader Attributes
ATTRIBUTE | REQ. | DESCRIPTION | POSSIBLE VALUES |
---|---|---|---|
Basic | |||
Fileset ID | ✓ | Unique identifier of the input file in Data One. This parameter is dynamically injected at runtime via the Data One Contract. By default, this attribute is implicitly set to the value of the ${DOP_FILESET_ID} graph parameter, if defined. | |
Advanced | |||
File Resource name | | Name of a Data One File Resource to be applied to the file content before streaming it to the output port. Data One File Resources are definitions of basic data processing operations for file content. Examples include line/record terminator character (EOR) conversion, charset conversion, compression/decompression, and encryption/decryption. By default, this attribute is implicitly set to the value of the ${DOP_FILERES_INPUT} graph parameter, if defined. | |
Required charset | | Specific charset that the file payload retrieved from the Data One file registry must have. Fixing this parameter to a specific value lets any graph component wired downstream of DataOneVFSReader handle data in a known charset. If a File Resource is specified (see previous attribute), this parameter is matched against the charset resulting from the File Resource application. If no File Resource is specified, this parameter is matched against the native charset of the file as reported in its Data One file registry descriptor. If the input file is expected to be binary, the charset should be set to NONE. When the expected charset does not match the charset detected in the input file (possibly after File Resource application, as explained above), the processing stops and a specific error is returned. By default, this attribute is implicitly set to the value of the ${DOP_CHARSET} graph parameter, if defined and not empty, or to UTF-8 otherwise. Note that this parameter only enforces a constraint on the input file charset and does NOT convert the charset of its payload. Conversion is performed by the File Resource, when specified. | NONE, UTF-8 |
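The Required charset rules can be summarized in a short sketch. It only models the decision logic described in the table: the function names and the `CharsetMismatchError` class are hypothetical, and the case-insensitive comparison and the idea that a binary file reports its charset as NONE are assumptions of this example, not documented behavior.

```python
from typing import Mapping, Optional


class CharsetMismatchError(Exception):
    """Hypothetical error standing in for the one that stops processing."""


def required_charset(graph_params: Mapping[str, str]) -> str:
    """Use ${DOP_CHARSET} if defined and not empty, UTF-8 otherwise."""
    value = graph_params.get("DOP_CHARSET")
    return value if value else "UTF-8"


def effective_charset(native_charset: str,
                      file_resource_charset: Optional[str] = None) -> str:
    """Charset to check: the File Resource result when a File Resource
    is applied, otherwise the native charset from the file descriptor."""
    if file_resource_charset is not None:
        return file_resource_charset
    return native_charset


def enforce_charset(graph_params: Mapping[str, str],
                    native_charset: str,
                    file_resource_charset: Optional[str] = None) -> str:
    """Check the constraint only; the payload is never converted here."""
    expected = required_charset(graph_params)
    actual = effective_charset(native_charset, file_resource_charset)
    # Assumption: charset names are compared case-insensitively.
    if expected.upper() != actual.upper():
        raise CharsetMismatchError(
            f"required charset {expected}, but input file is {actual}")
    return actual


# No DOP_CHARSET defined: the requirement falls back to UTF-8.
checked = enforce_charset({}, native_charset="UTF-8")
```

With no File Resource, a file whose native charset is ISO-8859-1 fails the default UTF-8 requirement. Passing `file_resource_charset="UTF-8"` models a File Resource that converts the payload first, so the same file then passes.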