Regular Expressions

A regular expression, regex or regexp (sometimes called a rational expression) is, in theoretical computer science and formal language theory, a sequence of characters that defines a search pattern. Usually this pattern is then used by string searching algorithms for “find” or “find and replace” operations on strings, or for input validation.

Source: Wikipedia
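
As a rough illustration of the three uses named in the definition above (find, find and replace, input validation), here is a minimal sketch using Python's standard re module. Python is an assumption made purely for illustration, and the sample text, the patterns and the is_number helper are hypothetical; the Code Editor's own regular expression dialect may differ in places.

```python
import re

# Hypothetical sample text used only for this illustration.
text = "order 66 shipped, order 7 pending"

# Find: collect every run of digits in the text.
found = re.findall(r"[0-9]+", text)

# Find and replace: mask every run of digits with a '#'.
masked = re.sub(r"[0-9]+", "#", text)

# Input validation: accept only strings made up entirely of digits.
def is_number(value: str) -> bool:
    return re.fullmatch(r"[0-9]+", value) is not None

print(found)
print(masked)
print(is_number("2022"), is_number("20a22"))
```

re.fullmatch is used for validation because it requires the whole input to match, whereas re.findall and re.sub look for the pattern anywhere in the text.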

^    A circumflex at the start of the expression matches the start of a line.
$    A dollar sign at the end of the expression matches the end of a line.
.    A period matches a single instance of any character. For example, b.t matches bot and bat, but not boat.
?    A question mark after a character or a character group matches zero or one occurrences of that character or group. For example, bo?t matches both bt and bot.
*    An asterisk after a character or a character group matches any number of occurrences of that character or group, including zero occurrences. For example, bo*t matches bt, bot, and boot.
+    A plus sign after a character or a character group matches one or more occurrences of that character or group. For example, bo+t matches bot and boot, but not bt.
|    A vertical bar matches either the expression on its left or the expression on its right. For example, bar|car matches either bar or car.
[ ]    Square brackets match any single character that appears inside them, but no others. For example, [bot] matches b, o, or t.
[^]    A circumflex as the first character inside square brackets means NOT. Hence, [^bot] matches any character except b, o, or t.
[-]    A hyphen inside square brackets signifies a range of characters. For example, [b-o] matches any character from b through o.
{ }    Braces group characters or expressions. Groups can be nested, with a maximum of 10 groups in a single pattern. In the Replace operation, a group is referred to by a backslash followed by a number that gives its position in the "Text to find" expression, beginning with 0. For example, with Find: {[0-9]}{[a-c]*} and Replace: NUM\1, the string 3abcabc is changed to NUMabcabc (here \0 is 3 and \1 is abcabc).
( )    Parentheses are an alternative to braces ({ }), with the same behavior.
\    A backslash before a wildcard character tells the Code Editor to treat that character literally, not as a wildcard. For example, \^ matches ^ and does not look for the start of a line.
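
The examples in the list above can be checked with a short script. The following is a sketch in Python's standard re module, which is an assumption: it is not the Code Editor's engine, and the full_match helper and the extra non-matching strings are hypothetical. The basic operators behave as described, but two differences are worth noting. Python groups only with parentheses (braces mean counted repetition there), and it numbers groups from 1, so the documented replacement NUM\1 is written as NUM\2 in this sketch.

```python
import re

def full_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

# (pattern, strings that should match, strings that should not match)
cases = [
    (r"b.t",     ["bot", "bat"],        ["boat"]),
    (r"bo?t",    ["bt", "bot"],         ["boot"]),
    (r"bo*t",    ["bt", "bot", "boot"], ["bat"]),
    (r"bo+t",    ["bot", "boot"],       ["bt"]),
    (r"bar|car", ["bar", "car"],        ["far"]),
    (r"[bot]",   ["b", "o", "t"],       ["a"]),
    (r"[^bot]",  ["a", "z"],            ["b", "o", "t"]),
    (r"[b-o]",   ["b", "h", "o"],       ["a", "p"]),
    (r"\^",      ["^"],                 ["a"]),
]

# One boolean per pattern: all expected matches succeed, all others fail.
results = [
    all(full_match(p, s) for s in yes) and not any(full_match(p, s) for s in no)
    for p, yes, no in cases
]
print(all(results))

# In Python, ^ and $ anchor to line boundaries only with re.MULTILINE.
lines = "bot\nrobot\nbottle"
starts = re.findall(r"^bot", lines, flags=re.MULTILINE)  # lines starting with bot
ends = re.findall(r"bot$", lines, flags=re.MULTILINE)    # lines ending with bot
print(len(starts), len(ends))

# Group replacement: the documented Find {[0-9]}{[a-c]*} with Replace NUM\1,
# translated to Python's parentheses and 1-based group numbers.
replaced = re.sub(r"([0-9])([a-c]*)", r"NUM\2", "3abcabc")
print(replaced)
```

Each pattern is tested against the whole string so that a partial hit, such as bo*t finding bt inside a longer word, is not counted as a match.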

Last updated: September 24, 2022