Below is a list of useful regular expression tokens. More detailed descriptions and further tokens can be found in many places on the web.
\
|
The backslash escapes any character and can therefore be used to force characters to be matched as literals instead of being treated as characters with special meaning. For example, '\[' matches '[' and '\\' matches '\'.
|
.
|
A dot matches any single character (in most regex engines, any character except a line break). For example, 'go.d' matches 'gold' and 'good'.
|
{ }
|
{n} ... Match exactly n times
{n,} ... Match at least n times
{n,m} ... Match at least n but not more than m times
|
[ ]
|
A string enclosed in square brackets matches any single character in that string, but no others. For example, '[xyz]' matches only 'x', 'y', or 'z'. A range of characters may be specified by two characters separated by '-'. Note that '[a-z]' matches lowercase alphabetic characters, while '[z-a]' never matches.
|
[-]
|
A hyphen within the brackets signifies a range of characters. For example, [b-o] matches any character from b through o.
|
|
|
A vertical bar matches the expression on either side of it. For example, bar|car will match either bar or car.
|
*
|
An asterisk matches any number of occurrences of the preceding character, including zero. For example, bo* matches b, bo, boo and booo.
|
+
|
A plus sign matches one or more occurrences of the preceding character. For example, bo+ matches bo, boo and booo, but not b or be.
|
\d+
|
matches all numbers with one or more digits
|
\d*
|
matches zero or more digits, so it also matches at positions where no digit is present
|
\w+
|
matches all words with one or more characters containing a-z, A-Z and 0-9. \w+ will find title, border, width etc. Please note that \w matches only digits and letters (a-z, A-Z, 0-9) with an ordinal value below 128. In most regex engines, \w also matches the underscore.
|
\s
|
matches a whitespace character (space, tab, carriage return or line feed)
|
.*?
|
finds as few characters as possible.
a.*?b means: find "a", followed by as few characters as possible, followed by "b".
|
[a-zA-Z\xA1-\xFF]+
|
matches all words with one or more characters containing a-z, A-Z and characters with an ordinal value from 161 through 255 (e.g. ä or Ü). If you want to find words with numbers, then add 0-9 to the expression: [0-9a-zA-Z\xA1-\xFF]+
|
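The tokens above can be tried out in any regex engine. The following is a minimal sketch using Python's `re` module, which is an assumption made for illustration only; the sample strings and variable names are invented, and other engines may differ in details such as how the dot treats line breaks.

```python
import re

# Escaping: a backslash turns a special character into a literal.
escaped_bracket = re.fullmatch(r"\[", "[") is not None
escaped_backslash = re.fullmatch(r"\\", "\\") is not None

# Dot: any single character ("god" is too short to match 'go.d').
dot_hits = re.findall(r"go.d", "gold good god")

# Quantifiers: * allows zero occurrences, + requires at least one.
star_hits = re.findall(r"bo*", "b bo boo booo")
plus_hits = re.findall(r"bo+", "b bo boo booo")
count_hits = re.findall(r"\d{2,3}", "1 22 333 4444")

# Character sets, ranges and alternation.
set_hits = re.findall(r"[xyz]", "xylophone zebra")
range_hits = re.findall(r"[b-o]", "above")
either_hits = re.findall(r"bar|car", "bar car far")

# Shorthand classes.
word_hits = re.findall(r"\w+", "title, border=0 width")
umlaut_hits = re.findall(r"[a-zA-Z\xA1-\xFF]+", "Müller über 42")

# Greedy versus lazy: .* takes as much as possible, .*? as little as possible.
greedy = re.search(r"a.*b", "a1b2b").group()
lazy = re.search(r"a.*?b", "a1b2b").group()

print(star_hits)     # ['b', 'bo', 'boo', 'booo']
print(plus_hits)     # ['bo', 'boo', 'booo']
print(greedy, lazy)  # a1b2b a1b
```

Comparing `star_hits` with `plus_hits` shows the only difference between the two quantifiers: the lone "b" is found by `bo*` but not by `bo+`.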
• | regex(bo*) |
will find "b", "bo", "boo", "booooo"
• | regex(bx+) |
will find "bxxxxxxxx", "bxx", "bx" but not "b"
• | regex(\d+) |
will find all numbers
• | regex(\d+ visitors) |
will find "3 visitors" or "243234 visitors" or "2763816 visitors"
• | regex(\d+ of \d+ messages) |
will find "2 of 1200 messages" or "1 of 10 messages"
• | RegexToEnd(\d+ of \d+ messages) |
will filter everything from the last occurrence of "2 of 1200 messages" or "1 of 10 messages" to the end of the page
• | regex(MyText.{20}) |
will find "MyText" and the next 20 characters after "MyText"
• | regex(\d\d.\d\d.\d\d\d\d) |
will find date-strings with format 99.99.9999 or 99-99-9999 (the dot in the regex matches any character)
• | regex(\d\d\.\d\d\.\d\d\d\d) |
will find date-strings with format 99.99.9999
• | regex(([_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+)) |
will find e-mail addresses of the common form name@domain.tld
|
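The example expressions can be checked against a sample text. The sketch below again assumes Python's `re` module and uses only the pattern inside `regex(...)`; the wrapper itself, the sample page text and the variable names are not part of Python and are invented here for illustration. The pattern `MyText.{20}` is one possible way to express "MyText and the next 20 characters".

```python
import re

# Hypothetical page content, made up for this demonstration.
page = (
    "Inbox: 2 of 1200 messages. Last visit 24.12.2023, next 01-01-2024. "
    "MyText is followed by more text here. Contact: jane.doe@example.com"
)

# Message counter.
messages = re.findall(r"\d+ of \d+ messages", page)

# Unescaped dot: any separator is accepted between the digit groups.
loose_dates = re.findall(r"\d\d.\d\d.\d\d\d\d", page)

# Escaped dot: only a literal '.' is accepted as separator.
strict_dates = re.findall(r"\d\d\.\d\d\.\d\d\d\d", page)

# A fixed word plus the next 20 characters.
my_text = re.search(r"MyText.{20}", page).group()

# E-mail pattern from the list above. It contains groups, so finditer
# with group(0) is used to obtain the whole match rather than the groups.
email_pattern = r"([_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+)"
emails = [m.group(0) for m in re.finditer(email_pattern, page)]

print(messages)      # ['2 of 1200 messages']
print(loose_dates)   # ['24.12.2023', '01-01-2024']
print(strict_dates)  # ['24.12.2023']
print(emails)        # ['jane.doe@example.com']
```

The two date results illustrate why the dot should be escaped when only dates written with dots are wanted.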