Execution Data Profile
Description
An Apache Hop Execution Data Profile builds data profiles as data flows through pipelines. A number of data samplers can be selected and configured to fine-tune the type and level of detail of the data that is profiled.
Options
Name
The name to be used for this Execution Data Profile
Description
A description to be used for this Execution Data Profile
Data Samplers to use
One or more data samplers to use with this Execution Data Profile. See details below.
Data Samplers
Data profile output rows
Performs basic data profiling on transform output rows
Sample size: the maximum number of sample rows kept for any discovered profiling result (default: 25)
Last transforms only: perform data profiling only on pipeline endpoints, that is, the last transforms (default: true)
Minima: store the minimum value for this data profile (default: true)
Maxima: store the maximum value for this data profile (default: true)
Count nulls: count null values for this data profile (default: true)
Count non-nulls: count non-null values for this data profile (default: true)
Min length: store the minimum lengths for this data profile (default: true)
Max length: store the maximum lengths for this data profile (default: true)
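The profiling options above can be pictured as a set of running statistics kept per output field. The following is a minimal sketch of that idea, not Hop's actual implementation: the `ColumnProfile` class and `profile_column` function are hypothetical names, and the sketch assumes that lengths are only tracked for string values.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass
class ColumnProfile:
    """Running statistics for one field (hypothetical structure, not Hop's)."""
    minimum: Optional[Any] = None      # "Minima"
    maximum: Optional[Any] = None      # "Maxima"
    nulls: int = 0                     # "Count nulls"
    non_nulls: int = 0                 # "Count non-nulls"
    min_length: Optional[int] = None   # "Min length"
    max_length: Optional[int] = None   # "Max length"

    def update(self, value: Any) -> None:
        """Fold one value into the running statistics."""
        if value is None:
            self.nulls += 1
            return
        self.non_nulls += 1
        if self.minimum is None or value < self.minimum:
            self.minimum = value
        if self.maximum is None or value > self.maximum:
            self.maximum = value
        # Assumption: lengths are only meaningful for string values.
        if isinstance(value, str):
            length = len(value)
            if self.min_length is None or length < self.min_length:
                self.min_length = length
            if self.max_length is None or length > self.max_length:
                self.max_length = length


def profile_column(values: Iterable[Any]) -> ColumnProfile:
    """Profile all values of a single field in one pass."""
    profile = ColumnProfile()
    for value in values:
        profile.update(value)
    return profile
```

Because each statistic is updated incrementally, the profile can be built in a single pass while rows stream through, without holding the full data set in memory.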
First output rows
Samples the first rows of a transform output
Sample size (default: 100)
Last output rows
Samples the last rows of a transform output
Sample size (default: 100)
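The first-rows and last-rows samplers described above can be sketched in a few lines. This is an illustration of the general technique only, not Hop's code: `first_rows` and `last_rows` are hypothetical helper names, and the default of 100 simply mirrors the documented sample size.

```python
from collections import deque
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def first_rows(rows: Iterable[T], sample_size: int = 100) -> List[T]:
    """Keep the first `sample_size` rows and stop reading after that."""
    return list(islice(rows, sample_size))


def last_rows(rows: Iterable[T], sample_size: int = 100) -> List[T]:
    """Keep the last `sample_size` rows using a bounded buffer.

    A deque with maxlen discards the oldest row each time a new one
    arrives, so memory use stays proportional to the sample size.
    """
    return list(deque(rows, maxlen=sample_size))
```

Both helpers work on a stream of unknown length; the last-rows variant has to read the whole stream, while the first-rows variant can stop early.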
Random output rows
Performs reservoir sampling on the output rows of a transform
Sample size (default: 100)
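Reservoir sampling draws a uniform random sample of fixed size from a stream whose length is not known in advance. The sketch below uses the classic "Algorithm R" variant as an assumption; which variant Hop uses is not stated here, and `reservoir_sample` is a hypothetical name.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    rows: Iterable[T],
    sample_size: int = 100,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Return a uniform random sample of up to `sample_size` rows (Algorithm R)."""
    rng = rng or random.Random()
    reservoir: List[T] = []
    for index, row in enumerate(rows):
        if index < sample_size:
            # Fill the reservoir with the first `sample_size` rows.
            reservoir.append(row)
        else:
            # Pick a slot in [0, index]; the new row replaces an existing
            # one only if the slot falls inside the reservoir.
            slot = rng.randint(0, index)
            if slot < sample_size:
                reservoir[slot] = row
    return reservoir
```

Passing a seeded `random.Random` instance makes the sample reproducible, which can help when comparing runs. If the stream holds fewer rows than the sample size, all rows are returned.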