# File Metadata

## ![](/files/eLXqIO7XzC7rorLVH4SR) File Metadata

### Description <a href="#description" id="description"></a>

The File Metadata transform scans a file to determine its metadata structure or layout.

Use this transforms in situations where you need to read a structured text file (e.g. CSV, TSV) when you don’t know the exact layout in advance.

The information provided in this file can be used to actually read the file later, e.g. through metadata injection.

The layout detected depends on the number of rows scanned.

For example, if the first 100 rows of a file are scanned and the first field is detected as an integer, there is a possibility this field contains alphanumerical characters in later rows. Using 0 rows for 'limit scanned rows' is a way to make sure the entire file is scanned and the layout is detected correctly, even though this may be time consuming or even impossible for large files.

| Hop Engine | <sup>✓</sup> |
| ---------- | ------------ |
| Spark      | ?            |
| Flink      | ?            |
| Dataflow   | ?            |

### Options

| Option               | Description                                                                                                  |
| -------------------- | ------------------------------------------------------------------------------------------------------------ |
| Transform name       | The name for this transform                                                                                  |
| Filename             | The filename to scan for metadata                                                                            |
| Filename in a field? | Enable to specify the file name(s) in a field in the input stream                                            |
| Filename field       | When the previous option is enabled, you can specify the field that will contain the filename(s) at runtime. |
| Limit scanned rows   | The number of rows to limit the scan to (default 10,000). Use 0 rows to scan the entire file.                |
| Fallback charset     | Charset to use while scanning the file                                                                       |
| Delimiter candidates | List of delimiters to try while detecting the file layout. Tab, semicolon, comma are provided by default.    |
| Enclosure candidates | List of delimiters to try while detecting the file layout. Double and single quote are provided by default.  |

### Output Fields

The fields returned by this transform are

* charset
* delimiter
* field\_count
* skip\_header\_lines
* skip\_footer\_lines
* header\_line\_present
* name
* type
* length
* precision
* mask
* decimal\_symbol
* grouping\_symbol


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.primeur.com/data-shaper-1.21/knowing-the-data-shaper-designer/pipelines/transforms/filemetadata.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
