# RegEx Rules

Java Regular Expressions, also known as RegEx, are sequences of characters that form a search pattern. A RegEx can be used to perform text searches because it defines a set of strings, that is, it describes what you are searching for.

The following sections list the essential RegEx rules that you might find useful when writing your own RegEx.

For the **complete RegEx syntax**, refer to the `Pattern` class in the Oracle documentation: <https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html>

## Character Classes

| Construct       | Description                                     |
| --------------- | ----------------------------------------------- |
| \[abc]          | Matches a, b, or c.                             |
| \[^abc]         | Negation, matches everything except a, b, or c. |
| \[a-c]          | Range, matches a, b, or c.                      |
| \[a-c\[f-h]]    | Union, matches a, b, c, f, g, or h.             |
| \[a-c&&\[b-c]]  | Intersection, matches b or c.                   |
| \[a-c&&\[^b-c]] | Subtraction, matches a.                         |
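
The union, intersection, and subtraction rows can be checked with a few lines of Java. This is a minimal sketch; the class name `CharacterClassDemo` and its helper are only illustrative.

```java
import java.util.regex.Pattern;

class CharacterClassDemo {
    // Returns true when the whole input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));          // true
        System.out.println(matches("[^abc]", "d"));         // true
        System.out.println(matches("[a-c[f-h]]", "g"));     // true: union
        System.out.println(matches("[a-c&&[b-c]]", "a"));   // false: intersection keeps only b and c
        System.out.println(matches("[a-c&&[^b-c]]", "a"));  // true: subtraction leaves only a
    }
}
```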

## Predefined Character Classes

| Construct | Description                                             |
| --------- | ------------------------------------------------------- |
| .         | Any character (may or may not match line terminators).  |
| \d        | A digit: \[0-9].                                        |
| \D        | A non-digit: \[^0-9].                                   |
| \s        | A whitespace character: \[ \t\n\x0B\f\r].               |
| \S        | A non-whitespace character: \[^\s].                     |
| \w        | A word character: \[a-zA-Z\_0-9].                       |
| \W        | A non-word character: \[^\w].                           |
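
As a short illustration of the predefined classes, the sketch below uses an illustrative class named `PredefinedClassDemo`. Note that inside a Java string literal every backslash must be doubled, so `\d` is written as `"\\d"`; this applies to Java source code and may not apply to a RegEx typed into a configuration field.

```java
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // Returns true when the whole input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d\\d", "42"));        // true: two digits
        System.out.println(matches("\\w+", "file_01"));     // true: letters, digits, underscore
        System.out.println(matches("\\w+", "file-01"));     // false: '-' is not a word character
        System.out.println(matches("\\S+\\s\\S+", "a b"));  // true: one whitespace between two tokens
        System.out.println(matches(".", "\n"));             // false: by default '.' skips line terminators
    }
}
```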

## Boundary Matches

| Construct | Description                                                |
| --------- | ---------------------------------------------------------- |
| ^         | The beginning of a line.                                   |
| $         | The end of a line.                                         |
| \b        | A word boundary.                                           |
| \B        | A non-word boundary.                                       |
| \A        | The beginning of the input.                                |
| \G        | The end of the previous match.                             |
| \Z        | The end of the input but for the final terminator, if any. |
| \z        | The end of the input.                                      |
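
The difference between some of these boundaries is easier to see by counting matches. The following is a minimal sketch with an illustrative helper `count`; it uses the default flags, so `^` anchors only to the start of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Counts how many times the regex is found in the input.
    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("\\bcat\\b", "cat concat cat")); // 2: "concat" has no boundary before "cat"
        System.out.println(count("^cat", "cat\ncat"));            // 1: no MULTILINE flag set
        System.out.println(count("cat\\z", "cat\ncat\n"));        // 0: input ends with a line terminator
        System.out.println(count("cat\\Z", "cat\ncat\n"));        // 1: final terminator is ignored
    }
}
```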

## Pattern Flags

| Flag                      | Description                                                                                                |
| ------------------------- | ---------------------------------------------------------------------------------------------------------- |
| Pattern.CASE\_INSENSITIVE | Enables case-insensitive matching.                                                                         |
| Pattern.COMMENTS          | Whitespace and comments starting with # are ignored until the end of a line.                               |
| Pattern.MULTILINE         | Enables multiline mode: ^ and $ match at the beginning and end of each line, not only of the entire input. |
| Pattern.UNIX\_LINES       | Only the '\n' line terminator is recognized in the behavior of ., ^, and $.                                |
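
The effect of the flags can be sketched as follows. The class `FlagDemo` and its `count` helper are illustrative only; the flags are passed to `Pattern.compile(String, int)` and can be combined with the bitwise OR operator.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
    // Counts the matches of a regex compiled with the given flags.
    static int count(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Without MULTILINE, ^ and $ only anchor to the whole input.
        System.out.println(count("^data$", 0, "data\ndata"));                 // 0
        System.out.println(count("^data$", Pattern.MULTILINE, "data\ndata")); // 2

        // Flags are bit masks and can be combined with |.
        int flags = Pattern.CASE_INSENSITIVE | Pattern.MULTILINE;
        System.out.println(count("^data$", flags, "DATA\ndata"));             // 2

        // COMMENTS lets the pattern carry whitespace and # comments.
        String commented = "\\d{4}  # year\n-\\d{2} # month";
        System.out.println(count(commented, Pattern.COMMENTS, "2025-04"));    // 1
    }
}
```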

## Pattern

A pattern is a compiled representation of a regular expression.

| Method                                            | Description                                                                              |
| ------------------------------------------------- | ---------------------------------------------------------------------------------------- |
| Pattern compile(String regex)                     | Compiles the given regular expression into a pattern.                                    |
| Pattern compile(String regex, int flags)          | Compiles the given regular expression into a pattern with the given flags.               |
| boolean matches(String regex, CharSequence input) | Compiles the given regular expression and attempts to match the given input against it.  |
| String\[] split(CharSequence input)               | Splits the given input sequence around matches of this pattern.                          |
| String quote(String s)                            | Returns a literal pattern String for the specified String.                               |
| Predicate\<String> asPredicate()                  | Creates a predicate which can be used to match a string.                                 |
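
A few of these methods are sketched below. The class `PatternDemo` and its helper names are illustrative. Keep in mind that the predicate returned by `asPredicate()` succeeds when the pattern is found anywhere in the string, not only when the whole string matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PatternDemo {
    // Splits a line on commas, ignoring whitespace around each comma.
    static String[] splitCsv(String line) {
        return Pattern.compile("\\s*,\\s*").split(line);
    }

    // Matches the input against the literal text, with metacharacters quoted.
    static boolean equalsLiteral(String literal, String input) {
        return Pattern.matches(Pattern.quote(literal), input);
    }

    // Keeps only the names in which the pattern \.zip$ is found.
    static List<String> keepZipFiles(List<String> names) {
        Pattern zip = Pattern.compile("\\.zip$");
        return names.stream().filter(zip.asPredicate()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitCsv("a , b,c")));       // [a, b, c]
        System.out.println(equalsLiteral("a.zip", "a.zip"));            // true
        System.out.println(equalsLiteral("a.zip", "aXzip"));            // false: the dot is literal
        System.out.println(keepZipFiles(Arrays.asList("a.zip", "b.txt"))); // [a.zip]
    }
}
```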

## Matcher

A Matcher is an engine that performs match operations on a character sequence by interpreting a pattern.

| Method            | Description                                                                           |
| ----------------- | ------------------------------------------------------------------------------------- |
| boolean matches() | Attempts to match the entire region against the pattern.                              |
| boolean find()    | Attempts to find the next subsequence of the input sequence that matches the pattern. |
| int start()       | Returns the start index of the previous match.                                        |
| int end()         | Returns the offset after the last character matched.                                  |
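
The typical `find()` loop, together with `start()` and `end()`, can be sketched like this. The class `MatcherDemo` and the `spans` helper are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
    // Returns "start-end" for every match of the regex in the input.
    static List<String> spans(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.start() + "-" + m.end()); // end() is exclusive
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(spans("\\d+", "ab12cd345"));                       // [2-4, 6-9]
        // matches() requires the entire input to match, find() does not.
        System.out.println(Pattern.compile("\\d+").matcher("ab12").matches()); // false
        System.out.println(Pattern.compile("\\d+").matcher("ab12").find());    // true
    }
}
```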

## Quantifiers

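| Type       | Behavior                                                                       |
| ---------- | ------------------------------------------------------------------------------ |
| Greedy     | Matches as much as possible, then backs off if the rest of the pattern fails.  |
| Reluctant  | Matches as little as possible, then expands if the rest of the pattern fails.  |
| Possessive | Matches as much as possible and never backs off.                               |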

| Greedy | Reluctant | Possessive | Meaning                                  |
| ------ | --------- | ---------- | ---------------------------------------- |
| X?     | X??       | X?+        | X, once or not at all.                   |
| X\*    | X\*?      | X\*+       | X, zero or more times.                   |
| X+     | X+?       | X++        | X, one or more times.                    |
| X{n}   | X{n}?     | X{n}+      | X, exactly n times.                      |
| X{n,}  | X{n,}?    | X{n,}+     | X, at least n times.                     |
| X{n,m} | X{n,m}?   | X{n,m}+    | X, at least n but not more than m times. |
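
The three quantifier types behave differently on the same input, as the following sketch shows. The class `QuantifierDemo` and the `firstMatch` helper are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns the first substring matched by the regex, or null if none is found.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<a><b>";
        System.out.println(firstMatch("<.+>", input));  // <a><b>  greedy: longest match
        System.out.println(firstMatch("<.+?>", input)); // <a>     reluctant: shortest match
        System.out.println(firstMatch("<.++>", input)); // null    possessive: .++ consumes the final '>' and never gives it back
    }
}
```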

## Groups & Backreferences

A group is a captured subsequence of characters which may be used later in the expression with a backreference.

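| Construct | Description                                                                                  |
| --------- | -------------------------------------------------------------------------------------------- |
| (...)     | Defines a group.                                                                             |
| \N        | Refers to the N-th matched group, where N is the group number.                               |
| (\d\d)    | A group of two digits.                                                                       |
| (\d\d)/\1 | Two digits, a slash, and the same two digits again. \1 refers to the first matched group.    |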
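
The backreference in the last row can be tried out as follows. This is a minimal sketch; `BackreferenceDemo` and its helpers are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackreferenceDemo {
    // True when the input is two digits, a slash, and the same two digits again.
    static boolean repeatsPair(String input) {
        return Pattern.matches("(\\d\\d)/\\1", input);
    }

    // Returns the text captured by group 1, or null when the input does not match.
    static String capturedPair(String input) {
        Matcher m = Pattern.compile("(\\d\\d)/\\1").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(repeatsPair("12/12"));  // true
        System.out.println(repeatsPair("12/34"));  // false: \1 must repeat "12"
        System.out.println(capturedPair("12/12")); // 12
    }
}
```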

## Logical Operations

| Construct | Description |
| --------- | ----------- |
| XY        | X then Y    |
| X\|Y      | X or Y      |
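
Alternation has lower precedence than concatenation, so grouping is often needed to limit its scope. The sketch below illustrates this with an illustrative class named `AlternationDemo`.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Returns true when the whole input matches the given regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("ab|cd", "ab"));             // true: "ab" or "cd"
        System.out.println(matches("ab|cd", "acd"));            // false: | splits the whole expression
        System.out.println(matches("a(b|c)d", "acd"));          // true: the group limits the alternation
        System.out.println(matches(".+\\.(zip|tar)", "x.tar")); // true: either extension is accepted
    }
}
```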

## Date placeholders

The RegEx field in the File Event Listener supports specific date placeholders that will be dynamically replaced at runtime.

**Syntax**

`${date:<number of days before or after today's date>,format}`

* `<number of days before or after today's date>` is the offset from today's date, which has offset `0`. You can go back with `-n`, e.g., `${date:-4}`, or forward with `+n`, e.g., `${date:+1}`.\
  For example, if today's date is 2025-04-10 and you enter `${date:-1}`, the resulting date is 2025-04-09.
* `format` is the date and time format, which can be specified using the standard notation described in [DateTimeFormatter (Java Platform SE 8)](https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html).\
  If no `format` is specified, the standard ISO 8601 format is used.
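
To make the offset and format rules concrete, the sketch below computes the value a placeholder would presumably resolve to. It is not the product's implementation: the class `DatePlaceholderSketch` and the `resolve` helper are hypothetical, and the default format is assumed to be `yyyy-MM-dd` based on the example above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DatePlaceholderSketch {
    // Hypothetical helper: shifts "today" by offsetDays and formats the result.
    // A null format falls back to ISO_LOCAL_DATE (yyyy-MM-dd), which is an assumption.
    static String resolve(LocalDate today, int offsetDays, String format) {
        LocalDate target = today.plusDays(offsetDays);
        DateTimeFormatter formatter = (format == null)
                ? DateTimeFormatter.ISO_LOCAL_DATE
                : DateTimeFormatter.ofPattern(format);
        return target.format(formatter);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2025, 4, 10);
        System.out.println(resolve(today, -1, null));     // 2025-04-09, like ${date:-1}
        System.out.println(resolve(today, 0, "yyMMdd"));  // 250410, like ${date:0, yyMMdd}
        System.out.println(resolve(today, 1, "MMdd"));    // 0411, like ${date:+1, MMdd}
    }
}
```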

These placeholders are supported:

* **TODAY ISO format**\
  Format: `${date}` or `${date:0}`\
  Example: `^cme.${date}.s.pa2.zip$`
* **TODAY**\
  Format: `${date:0, format}`\
  Example: `^ME${date: 0, yyMMdd}.zip$`
* **YESTERDAY ISO format**\
  Format: `${date:-1}`\
  Example: `^IRM2.0_IM_ICUS_${date:-1}.zip`
* **YESTERDAY**\
  Format: `${date:-1, format}`\
  Example: `^OPT${date:-1, MMdd}F.SP6.zip$`
* **TOMORROW**\
  Format: `${date:+1, format}`\
  Example: `^OPT${date:+1, MMdd}F.SP6.zip$`
* **Custom PAST DAY**\
  A custom number of days in the past from today's date: `${date:-3}`\
  Example: `^OPT${date:-3, yyyyMMdd}F.SP6.zip$`
* **Custom FUTURE DAY**\
  A custom number of days in the future from today's date: `${date:+2}`\
  Example: `^OPT${date:+2, yyyyMMdd}F.SP6.zip$`
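
Once a placeholder has been replaced, the result is an ordinary RegEx. The sketch below assumes that `^OPT${date:-1, MMdd}F.SP6.zip$` resolves to `^OPT0409F.SP6.zip$` when today's date is 2025-04-10, and tests it against sample file names (the names and the class `ResolvedPatternDemo` are illustrative). It also shows that an unescaped `.` matches any character, so escaping it as `\.` gives a stricter match.

```java
import java.util.regex.Pattern;

class ResolvedPatternDemo {
    // Returns true when the whole file name matches the regex.
    static boolean matches(String regex, String fileName) {
        return Pattern.compile(regex).matcher(fileName).matches();
    }

    public static void main(String[] args) {
        // Assumed result of the placeholder substitution for 2025-04-10.
        String resolved = "^OPT0409F.SP6.zip$";
        System.out.println(matches(resolved, "OPT0409F.SP6.zip")); // true
        System.out.println(matches(resolved, "OPT0409FxSP6xzip")); // true: '.' matches any character

        // Escaping the dots restricts them to a literal '.'.
        String strict = "^OPT0409F\\.SP6\\.zip$";
        System.out.println(matches(strict, "OPT0409F.SP6.zip"));   // true
        System.out.println(matches(strict, "OPT0409FxSP6xzip"));   // false
    }
}
```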

