Rewrite variables can be used in regular expression patterns to define rewrite operations that are independent of the specific mapping strings.
Rewrite variables
Regex syntax
Airlock Gateway uses Perl Compatible Regular Expressions (PCRE).
Basic operators
ab|xy      Alternation: either 'ab' or 'xy'
(str)      The string 'str' as a logical and captured group
(?:str)    The string 'str' as a logical uncaptured group
a?         The letter 'a' once or not at all
a*         The letter 'a' in unrestricted quantity (including zero)
a+         The letter 'a' at least once
[abc]      Character class: a, b or c
[^abc]     Negative character class: any character except a, b or c
[a-c]      Range: a, b or c
^          The beginning of the string
$          The end of the string
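The operators above can be tried out with a short script. The following is a minimal sketch that uses Python's `re` module as a stand-in; it is not the PCRE engine used by Airlock Gateway, although these basic operators behave the same way in both. The helper `found` and the sample strings are made up for illustration.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return re.search(pattern, text) is not None

# Alternation and grouping
alternation = found(r"ab|xy", "xy")                     # 'ab' or 'xy'
captured = re.search(r"(ab|xy)z", "xyz").group(1)       # group 1 holds the matched variant
uncaptured = re.search(r"(?:ab|xy)z", "abz").groups()   # uncaptured group stores nothing

# Quantifiers, anchored with ^ and $ so the whole string has to fit
samples = ("b", "ba", "baa")
zero_or_one = [found(r"^ba?$", s) for s in samples]
zero_or_more = [found(r"^ba*$", s) for s in samples]
one_or_more = [found(r"^ba+$", s) for s in samples]

# Character classes
in_class = found(r"^[abc]$", "b")
negated = found(r"^[^abc]$", "d")
in_range = found(r"^[a-c]$", "d")

print(captured, uncaptured, zero_or_one, zero_or_more, one_or_more)
```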
Characters
The backslash character '\' is used for escaping.
An ASCII character typically gets a special meaning if it is preceded by a backslash '\'.
.            Any character (including CR or LF)
\x{hhh..}    Character with Unicode code point U+hhh.. (1 to 6 hex digits)
\n           Line feed character (U+000A)
\r           Carriage return character (U+000D)
\t           Tab character (U+0009)
Airlock Gateway does not allow a backslash before an alphabetic character unless the combination denotes an escaped construct; such combinations are reserved for future extensions to the regular-expression language.
Newlines are treated as ordinary characters. They do not have any special meaning in the processed string.
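The behaviour described in this section can be approximated in Python as a rough illustration. This sketch assumes Python's `re` module, which differs from Airlock Gateway's PCRE in two ways that matter here: `.` only matches line breaks when the `re.DOTALL` flag is set, and code points are written as `\uhhhh` rather than `\x{hhh..}`. The sample text is invented.

```python
import re

text = "line one\nline two"

# Python's default: "." stops at a line feed, so the full string does not match.
default_dot = re.fullmatch(r".*", text) is not None

# With DOTALL, "." also matches LF and CR, as described above for Airlock Gateway.
dotall_dot = re.fullmatch(r".*", text, re.DOTALL) is not None

# Escapes for single characters
tab_found = re.search(r"\t", "a\tb") is not None
nbsp_found = re.search(r"\u00a0", "a\u00a0b") is not None   # PCRE spelling: \x{00a0}

# A backslash before a letter that does not form a known escape is rejected.
try:
    re.compile(r"\q")
    unknown_escape_rejected = False
except re.error:
    unknown_escape_rejected = True

print(default_dot, dotall_dot, tab_found, nbsp_found, unknown_escape_rejected)
```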
Escaping
A non-alphanumeric character loses its special meaning if it is preceded by a backslash.
A left parenthesis '(' is the start of a group. If it is escaped '\(', it matches just a left parenthesis.
\\           The backslash character
\?           Escaped character (for any non-alphanumeric character)
\Q .. \E     Literal-text span: treat enclosed characters as literal until the first appearance of \E (no escaping possible)
A backslash may be used prior to any non-alphabetic character regardless of whether the character has a special meaning in that context or not.
Escaping is normally needed for these characters: [({.*?+^$\|
Depending on the context, the closing bracket and the closing parenthesis also have to be escaped: ])
Inside brackets, escaping is only needed for the right bracket character (but others are allowed as well): ]
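The escaping rules above can be illustrated with a small experiment. This is a sketch in Python's `re` module, not in Airlock Gateway's PCRE. Python has no `\Q .. \E` span, so `re.escape` is used here as a comparable way of treating a whole string literally. The string `price[1].*` is a made-up example.

```python
import re

# An escaped parenthesis matches the literal character ...
literal_paren = re.search(r"\(", "f(x)").group(0)

# ... while an unescaped one opens a group that has to be closed again.
try:
    re.compile(r"(")
    unbalanced_rejected = False
except re.error:
    unbalanced_rejected = True

# re.escape puts a backslash in front of every special character,
# which is comparable to wrapping the text in \Q .. \E under PCRE.
literal = "price[1].*"
escaped = re.escape(literal)
escaped_matches_itself = re.fullmatch(escaped, literal) is not None
raw_matches_itself = re.fullmatch(literal, literal) is not None   # "[1]" is read as a class

# Inside brackets the right bracket is escaped; "$" is literal there.
bracket_hits = re.findall(r"[\]$]", "a]b$c")

print(literal_paren, unbalanced_rejected, escaped_matches_itself, raw_matches_itself, bracket_hits)
```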
Generic character types
ASCII character types
\d    Any ASCII decimal digit, equal to [0123456789]
\D    Any character that is not an ASCII decimal digit
\s    Any ASCII whitespace character, equal to ' ', HT, LF, FF, CR
\S    Any character that is not an ASCII whitespace character
\w    Any ASCII "word" character, equal to [a-zA-Z0-9_]
\W    Any "non-word" character
Non-ASCII character types
\h    Any horizontal whitespace character (including non-ASCII U+2000, U+00A0, U+180E, ...)
\H    Any character that is not a horizontal whitespace character
\v    Any vertical whitespace character (including non-ASCII U+2028, U+0085)
\V    Any character that is not a vertical whitespace character
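The difference between the ASCII and the non-ASCII character types can be sketched as follows. The example uses Python's `re` module as a stand-in, with two assumptions: the `re.ASCII` flag restricts `\d`, `\s` and `\w` to ASCII as described above, and the PCRE classes `\h` and `\v`, which Python does not provide, are spelled out as explicit character classes. The names `HORIZONTAL_WS` and `VERTICAL_WS` are hypothetical, and the listed code points follow the PCRE documentation as far as known.

```python
import re

A = re.ASCII  # restrict \d, \s and \w to ASCII characters

ascii_digit = re.fullmatch(r"\d+", "2024", A) is not None
arabic_digit = re.fullmatch(r"\d+", "\u0663", A) is not None    # ARABIC-INDIC DIGIT THREE
ascii_word = re.fullmatch(r"\w+", "snake_case9", A) is not None
accented_word = re.fullmatch(r"\w+", "na\u00efve", A) is not None
non_word = re.findall(r"\W", "a-b c", A)

# Assumed equivalents of PCRE's \h and \v, written as explicit classes.
HORIZONTAL_WS = r"[\t \u00a0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]"
VERTICAL_WS = r"[\n\x0b\f\r\u0085\u2028\u2029]"

nbsp_is_horizontal = re.fullmatch(HORIZONTAL_WS, "\u00a0") is not None
lf_is_horizontal = re.fullmatch(HORIZONTAL_WS, "\n") is not None
line_sep_is_vertical = re.fullmatch(VERTICAL_WS, "\u2028") is not None
space_is_vertical = re.fullmatch(VERTICAL_WS, " ") is not None

print(ascii_digit, arabic_digit, ascii_word, accented_word, non_word)
```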
Comments
(?#...) comment (not nestable)
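An inline comment does not change what a pattern matches. The following short sketch shows this with Python's `re` module, which supports the same `(?#...)` syntax; the URL pattern and the sample string are made up for illustration.

```python
import re

commented = r"https?(?#the s is optional)://(?#host part)[a-z.]+"
plain = r"https?://[a-z.]+"

url = "https://example.com"
with_comments = re.fullmatch(commented, url) is not None
without_comments = re.fullmatch(plain, url) is not None

print(with_comments, without_comments)
```

Because comments cannot be nested, the comment ends at the first right parenthesis.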
Simple examples
Pattern | Matches |
---|---|
| 'http' or 'https' |
| 'abc' or 'xyz'. The found variant can be referenced as $1 in rewrite rules (capturing) |
| 'this' or 'that'. The value is not captured and cannot be referenced in rewrite rules |
| Any number of digits (at least one) - but just digits. |
| Any string containing at least one digit. This could be achieved also by just using \d |
| Just the dollar sign - escaping is supported in bracket expressions |
| Backslash or dollar sign |
| Any character but a tabulator |
| The non-breaking-space character |
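The pattern cells of the table above are not shown, so the following sketch pairs each description with an assumed pattern that fits it. These patterns are reconstructions for illustration, not the original table entries. The code uses Python's `re` module with the `re.ASCII` and `re.DOTALL` flags to approximate the behaviour described on this page, and writes the non-breaking space as `\u00a0` where PCRE would use `\x{00a0}`.

```python
import re

FLAGS = re.ASCII | re.DOTALL

# (description, assumed pattern, text that should match, text that should not)
EXAMPLES = [
    ("'http' or 'https'", r"https?", "https", "ftp"),
    ("'abc' or 'xyz', captured", r"(abc|xyz)", "xyz", "abd"),
    ("'this' or 'that', not captured", r"(?:this|that)", "that", "thus"),
    ("digits only, at least one", r"^\d+$", "12345", "123a"),
    ("contains at least one digit", r"^.*\d.*$", "abc1def", "abcdef"),
    ("just the dollar sign", r"[\$]", "$", "S"),
    ("backslash or dollar sign", r"[\\$]", "\\", "/"),
    ("any character but a tab", r"[^\t]", "a", "\t"),
    ("non-breaking space", r"\u00a0", "\u00a0", " "),
]

def check(pattern: str, text: str) -> bool:
    """Return True if the pattern is found in the text."""
    return re.search(pattern, text, FLAGS) is not None

results = [(check(p, good), check(p, bad)) for _, p, good, bad in EXAMPLES]

# A captured group can be referenced in a replacement; Python writes \1
# where the rewrite rules described above use $1.
rewritten = re.sub(r"(abc|xyz)", r"[\1]", "xyz")

for (description, pattern, _, _), outcome in zip(EXAMPLES, results):
    print(f"{pattern!r:18} {description}: {outcome}")
```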
More documentation
There are more detailed explanations and examples on the Regex Expert page.
For even more detailed documentation, please read the extensive original PCRE man pages.