Metadata Types
Metadata is one of the cornerstones of Hop. It covers workflows, pipelines, and every other type of metadata object.
Hop Gui has a Metadata Perspective to manage all types of metadata: run configurations, database (relational and NoSQL) connections, logging, and pipeline probes, to name a few.
Metadata is typically stored as a set of JSON files in a project's metadata folder, with one subfolder per metadata type. The only exceptions to the rule are workflows and pipelines, which are defined as XML (for now, for historical reasons). Since workflows and pipelines are what Hop is all about, these are typically stored in your project folder, not in your project’s metadata folder.
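The layout described above (one subfolder per metadata type, one JSON file per metadata element) can be inspected with a few lines of Python. This is a minimal sketch, not part of Hop: the function name `list_metadata`, the demo folder names, and the file names are all hypothetical, and it assumes each metadata element is a `*.json` file directly inside its type subfolder.

```python
import tempfile
from pathlib import Path


def list_metadata(metadata_dir):
    """Group metadata element names by type.

    Assumes one subfolder per metadata type and one JSON file per element.
    Subfolders without any JSON files are skipped.
    """
    root = Path(metadata_dir)
    result = {}
    for type_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        names = sorted(f.stem for f in type_dir.glob("*.json"))
        if names:
            result[type_dir.name] = names
    return result


if __name__ == "__main__":
    # Build a throwaway folder with hypothetical type and element names.
    with tempfile.TemporaryDirectory() as tmp:
        metadata = Path(tmp) / "metadata"
        for type_name, element in [
            ("example-connection-type", "warehouse"),
            ("example-run-config-type", "local"),
        ]:
            folder = metadata / type_name
            folder.mkdir(parents=True)
            (folder / f"{element}.json").write_text("{}")

        for type_name, elements in list_metadata(metadata).items():
            print(type_name, elements)
```

To use it against a real project, pass the path of that project's metadata folder to `list_metadata`. Workflows and pipelines will not appear in the result, since they are stored as XML in the project folder rather than in the metadata folder.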
By default, Hop contains the following metadata types:
- Asynchronous Web Service: Execute and query a workflow asynchronously through a web service. 
- Azure Blob Storage Authentication: An Azure Blob Storage connection type.
- Beam File Definition: Describes a file layout in a Beam Pipeline 
- Cassandra Connection: Describes a connection to a Cassandra cluster 
- Data Set: This defines a data set, a static pre-defined collection of rows 
- Execution Data Profile: Collects and profiles data as it flows through a pipeline using configurable samplers for insight into value ranges, nulls, and row samples. 
- Execution Information Location: Defines where and how Apache Hop stores execution metadata, supporting local files, remote servers, Neo4j, or Elastic for later inspection and analysis. 
- Google Storage Authentication: A Google Cloud Storage connection type. 
- Hop Server: Defines a Hop Server 
- MongoDB Connection: Describes a MongoDB connection 
- Mail Server Connection: Describes a mail server connection 
- Neo4j Connection: A shared connection to a Neo4j server 
- Neo4j Graph Model: Description of the nodes, relationships, indexes, … of a Neo4j graph 
- Partition Schema: Describes a partition schema 
- Pipeline Log: Logs the activity of a pipeline with another pipeline
- Pipeline Probe: Streams output rows of a pipeline to another pipeline
- Pipeline Run Configuration: Describes how and with which engine a pipeline is to be executed 
- Pipeline Unit Test: Describes a test for a pipeline with alternative data sets as input from a certain transform and testing output against golden data 
- Relational Database Connection: Describes all the metadata needed to connect to a relational database 
- REST Connection: Describes all the metadata needed to connect to a REST API.
- Splunk Connection: Describes a Splunk connection 
- Static Schema Definition: Defines a reusable data stream layout to ensure consistency across multiple pipelines and simplify schema management. 
- Variable Resolver: Use plugins to resolve variable values with a pipeline, a key store, a vault, or a secret manager.
- Web Service: Runs a pipeline to generate output for a servlet on Hop Server
- Workflow Log: Logs the activity of a workflow with a pipeline
- Workflow Run Configuration: Describes how to run a workflow 
