Regular Expressions
With Regular Expressions you can define complex search and filter expressions. All regular expressions are case-insensitive because they are converted to lowercase internally.
Regular Expressions must be placed in one of the following functions:
- regex( ... )
Filters the text matched by the given Regular Expression
For example: regex(\d+ downloads)
- StartToRegex( ... )
Filters everything from the page beginning to the first occurrence of the given Regular Expression
For example: StartToRegex(\d+ visitors)
- RegexToRegex( ... , ... )
Filters everything between two Regular Expressions
For example: RegexToRegex(Downloads\: \d+,License\:)
- RegexToEnd( ... )
Filters everything from the last occurrence of the given Regular Expression to the end of the page
For example: RegexToEnd(\d+ users online)
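The four functions above can be sketched in Python to make their ranges easier to picture. This is only an illustration built on Python's re module, not the program's own implementation. The helper names (regex_matches, start_to_regex, regex_to_regex, regex_to_end) and the sample page are hypothetical, and the sketch assumes that the matched text itself is part of the filtered range.

```python
import re

# Hypothetical helpers; the names and the inclusive ranges are assumptions.

def regex_matches(pattern, page):
    """regex( ... ): every piece of text matched by the pattern."""
    return [m.group(0) for m in re.finditer(pattern, page, re.IGNORECASE)]

def start_to_regex(pattern, page):
    """StartToRegex( ... ): page beginning up to the first match (assumed inclusive)."""
    m = re.search(pattern, page, re.IGNORECASE)
    return page[:m.end()] if m else None

def regex_to_regex(start, end, page):
    """RegexToRegex( ... , ... ): first match of start up to the next match of end."""
    s = re.search(start, page, re.IGNORECASE)
    if s is None:
        return None
    e = re.compile(end, re.IGNORECASE).search(page, s.end())
    return page[s.start():e.end()] if e else None

def regex_to_end(pattern, page):
    """RegexToEnd( ... ): last match of the pattern up to the end of the page."""
    last = None
    for m in re.finditer(pattern, page, re.IGNORECASE):
        last = m
    return page[last.start():] if last else None

# Made-up sample page for illustration only.
page = "Welcome! 12 visitors today. Downloads: 340 License: free. 7 users online now"

print(regex_matches(r"\d+ visitors", page))
print(start_to_regex(r"\d+ visitors", page))
print(regex_to_regex(r"Downloads\: \d+", r"License\:", page))
print(regex_to_end(r"\d+ users online", page))
```

Every helper passes re.IGNORECASE to mirror the case-insensitive matching described at the top of this page.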
Below you can find a list of important and supported elements of Regular Expressions.
- \
The backslash escapes any character and can therefore be used to force characters to be matched as literals instead of being treated as characters with special meaning. For example, '\[' matches '[' and '\\' matches '\'.
- .
A dot matches any single character. For example, 'go.d' matches 'gold' and 'good'.
- { }
{n} ... Match the preceding element exactly n times
{n,} ... Match the preceding element at least n times
{n,m} ... Match the preceding element at least n but not more than m times
- [ ]
A string enclosed in square brackets matches any single character in that string, but no others. For example, '[xyz]' matches only 'x', 'y', or 'z'. A range of characters may be specified by two characters separated by '-'. Note that '[a-z]' matches alphabetic characters, while '[z-a]' never matches.
- [-]
A hyphen within the brackets signifies a range of characters. For example, [b-o] matches any character from b through o.
- |
A vertical bar matches the expression on either side of it. For example, bar|car will match either bar or car.
- *
An asterisk matches any number of occurrences of the preceding character, including zero. For example, bo* matches b, bo, boo and booo.
- +
A plus sign matches one or more occurrences of the preceding character. For example, bo+ matches bo, boo and booo, but not b or be.
- \d+
matches all numbers with one or more digits
- \d*
matches all numbers with zero or more digits
- \w+
matches all words with one or more characters consisting of a-z, A-Z and 0-9. \w+ will find title, border, width etc. Please note that \w matches only digits and letters (a-z, A-Z, 0-9) with ordinal values below 128.
- [a-zA-Z\xA1-\xFF]+
matches all words with one or more characters consisting of a-z, A-Z and characters with ordinal values from 161 to 255 (e.g. ä or Ü). If you want to find words with numbers, add 0-9 to the expression: [0-9a-zA-Z\xA1-\xFF]+
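The elements listed above can be tried out in any standard regex engine. The following sketch uses Python's re module and assumes it treats these elements the same way as the program does; matches_whole is a hypothetical helper that tests whether a pattern matches a complete string. The ascii_only switch limits \w to characters below ordinal value 128, as described above, because Python's \w would otherwise also match letters such as ä.

```python
import re

def matches_whole(pattern, text, ascii_only=False):
    """Return True if the pattern matches the entire text, ignoring case."""
    flags = re.IGNORECASE | (re.ASCII if ascii_only else 0)
    return re.fullmatch(pattern, text, flags) is not None

# (pattern, text) pairs covering the elements from the list above.
samples = [
    (r"\[", "["),                      # backslash escapes a special character
    (r"go.d", "gold"),                 # dot: any single character
    (r"go.d", "god"),                  # dot needs exactly one character
    (r"\d{2,4}", "123"),               # between 2 and 4 digits
    (r"\d{2,4}", "12345"),             # too many digits
    (r"[xyz]", "y"),                   # one character out of a set
    (r"[b-o]", "z"),                   # outside the range b through o
    (r"bar|car", "car"),               # alternation
    (r"bo*", "b"),                     # zero or more 'o'
    (r"bo+", "b"),                     # one or more 'o'
    (r"bo+", "booo"),
    (r"[a-zA-Z\xA1-\xFF]+", "Über"),   # letters beyond ordinal value 161
]

for pattern, text in samples:
    print(pattern, repr(text), matches_whole(pattern, text))

# \w restricted to characters below ordinal value 128:
print(matches_whole(r"\w+", "width", ascii_only=True))
print(matches_whole(r"\w+", "größe", ascii_only=True))
```

A pattern like '[z-a]' is left out on purpose: Python rejects it as an invalid range instead of simply never matching.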
Below you can find some examples of complete expressions.
- regex(bo*)
will find "b", "bo" and "boo" (and the "bo" in "bot")
- regex(bx+)
will find "bx", "bxx" and "bxxxxxxxx", but not "b" or "be"
- regex(\d+)
will find all numbers
- regex(\d+ visitors)
will find "3 visitors" or "243234 visitors" or "2763816 visitors"
- regex(\d+ of \d+ messages)
will find "2 of 1200 messages" or "1 of 10 messages"
- RegexToEnd(\d+ of \d+ messages)
will filter everything from the last occurrence of "2 of 1200 messages" or "1 of 10 messages" to the end of the page
- regex(MyText.{0,20})
will find "MyText" and up to 20 characters after "MyText"
- regex(\d\d.\d\d.\d\d\d\d)
will find date-strings with format 99.99.9999 or 99-99-9999 (the dot in the regex matches any character)
- regex(\d\d\.\d\d\.\d\d\d\d)
will find date-strings with format 99.99.9999
- regex(([_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+))
will find most common e-mail addresses
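Some of the examples above can be checked against a piece of sample text. The following sketch again uses Python's re module as a stand-in for the program's own engine, which is an assumption; find_all is a hypothetical helper and the sample text is made up for illustration.

```python
import re

def find_all(pattern, text):
    """Return every piece of text matched by the pattern, ignoring case."""
    return [m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE)]

# Made-up sample text for illustration only.
text = "Updated 05.11.2023 and 06-12-2023. 2 of 1200 messages. Contact: info@example.com"

# Unescaped dots match any character, so both date formats are found.
print(find_all(r"\d\d.\d\d.\d\d\d\d", text))

# Escaped dots match only literal dots.
print(find_all(r"\d\d\.\d\d\.\d\d\d\d", text))

print(find_all(r"\d+ of \d+ messages", text))

# The e-mail expression from the list above.
print(find_all(r"([_a-zA-Z\d\-\.]+@[_a-zA-Z\d\-]+(\.[_a-zA-Z\d\-]+)+)", text))

# {0,20} takes at most 20 characters after "MyText", fewer if the text ends earlier.
print(find_all(r"MyText.{0,20}", "MyText is short"))
print(len(find_all(r"MyText.{0,20}", "MyText" + "x" * 30)[0]))
```

Comparing the first two calls shows why the dots in a date expression should be escaped when only the 99.99.9999 format is wanted.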