Create Ignore / Watch filter manually


As an alternative to the Auto-Filter system and the Filter-Assistant, you can create all Ignore/Watch filters manually.

 

Enter one filter expression per line. All filter definitions are case insensitive; they are converted to lowercase internally.

 


 

The button "New Ignore Filter" (or "New Watch Filter") opens a helper dialog where you can enter and validate a new filter expression. The button "Test selected line" lets you validate the filter expression in the active line. The "Test filter" button tests all filter definitions by comparing the filtered text content of the new page with the filtered text content of the old page.

 

Each line can contain one of the following elements:

 

1. Static text phrase
Use this type to ignore specific phrases in a page.
 
2. Text with a wildcard
A wildcard filter is typically used to ignore areas of a page that are delimited by a specified start and/or end text.
 
3. Regular expression
Regular expressions let you define complex filter expressions. They must be wrapped in a special function, for example "regex(....)".

Wildcard filter

WebSite-Watcher supports three types of wildcard filters. Only one wildcard is allowed per filter expression:

 

*EndText

This form filters everything from the beginning of the page to the first occurrence of "EndText".

For example: *Daily News

 

StartText*EndText

This form filters all text areas which begin with "StartText" and end with "EndText".

For example: Downloads:*Publisher

 

StartText*

This form filters everything from the last occurrence of "StartText" to the end of the page.

For example: Users online*
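
The three wildcard forms can be modelled with a short Python sketch. This is not WebSite-Watcher's implementation, only an illustration of the behaviour described above. The function name `apply_wildcard_filter` is hypothetical, and the sketch assumes that the start and end texts are removed together with the area between them, which the text above does not state explicitly.

```python
def apply_wildcard_filter(text: str, expr: str) -> str:
    """Return `text` with the area described by a one-wildcard filter removed.

    Illustrative model only. Assumption: the delimiter texts themselves
    are removed along with the area they enclose.
    """
    if expr.count("*") != 1:
        raise ValueError("exactly one wildcard is allowed per filter expression")
    start, end = (part.lower() for part in expr.split("*"))
    if not start and not end:
        raise ValueError("a wildcard filter needs a start text, an end text, or both")

    # Filters are case insensitive, so search in a lowercased copy.
    # (Assumes lowercasing does not change the length of the text.)
    lowered = text.lower()

    if not start:  # *EndText: page beginning up to the first occurrence
        pos = lowered.find(end)
        return text if pos < 0 else text[pos + len(end):]

    if not end:  # StartText*: last occurrence up to the end of the page
        pos = lowered.rfind(start)
        return text if pos < 0 else text[:pos]

    # StartText*EndText: remove every area that begins with the start
    # text and ends with the next end text.
    kept = []
    pos = 0
    while True:
        s = lowered.find(start, pos)
        if s < 0:
            break
        e = lowered.find(end, s + len(start))
        if e < 0:
            break
        kept.append(text[pos:s])
        pos = e + len(end)
    kept.append(text[pos:])
    return "".join(kept)


page = "Header Daily News Body Downloads: 42 Publisher: X Users online: 5"
print(apply_wildcard_filter(page, "*Daily News"))
print(apply_wildcard_filter(page, "Downloads:*Publisher"))
print(apply_wildcard_filter(page, "Users online*"))
```

If a delimiter text does not occur in the page, the sketch leaves the page unchanged.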

Regular Expressions

WebSite-Watcher supports Perl 5 compatible regular expressions, which can be used to create complex filter definitions. Regular expressions must be placed in one of the following functions:

 

regex( ... )

Filters the defined regular expression

For example: regex(\d+ downloads)

 

FirstRegex( ... )

Filters only the first occurrence of the defined regular expression

For example: FirstRegex(\d+ downloads)

 

StartToRegex( ... )

Filters everything from the page beginning to the first occurrence of the defined regular expression

For example: StartToRegex(\d+ visitors)

 

RegexToRegex( ... , ... )

Filters everything between two regular expressions

For example: RegexToRegex(Downloads\: \d+,License\:)

 

RegexToEnd( ... )

Filters everything from the last occurrence of the defined regular expression to the end of the page

For example: RegexToEnd(\d+ users online)
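
The five functions above differ only in which part of the page they remove. The following Python sketch models that difference using Python's `re` module, whose syntax is close to but not identical with Perl 5. The Python function names are hypothetical stand-ins, and the sketch assumes that the matched text itself is removed together with the area it delimits.

```python
import re

FLAGS = re.IGNORECASE  # filter definitions are case insensitive


def regex(text, pattern):
    """Model of regex(...): remove every match."""
    return re.sub(pattern, "", text, flags=FLAGS)


def first_regex(text, pattern):
    """Model of FirstRegex(...): remove only the first match."""
    return re.sub(pattern, "", text, count=1, flags=FLAGS)


def start_to_regex(text, pattern):
    """Model of StartToRegex(...): remove from the page beginning through the first match."""
    m = re.search(pattern, text, flags=FLAGS)
    return text if m is None else text[m.end():]


def regex_to_regex(text, start_pattern, end_pattern):
    """Model of RegexToRegex(..., ...): remove each area between the two patterns."""
    # [\s\S]*? matches any characters, including newlines, as few as possible.
    combined = "(?:" + start_pattern + r")[\s\S]*?(?:" + end_pattern + ")"
    return re.sub(combined, "", text, flags=FLAGS)


def regex_to_end(text, pattern):
    """Model of RegexToEnd(...): remove from the last match to the end of the page."""
    last = None
    for m in re.finditer(pattern, text, flags=FLAGS):
        last = m
    return text if last is None else text[:last.start()]


page = "Intro 120 visitors. Tool A: 15 downloads. Tool B: 7 downloads. 3 users online now"
print(regex(page, r"\d+ downloads"))
print(first_regex(page, r"\d+ downloads"))
print(start_to_regex(page, r"\d+ visitors"))
print(regex_to_end(page, r"\d+ users online"))
print(regex_to_regex("Name Downloads: 15 stuff License: free", r"Downloads\: \d+", r"License\:"))
```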

 

RegexCmp( ... )

Finds a defined regular expression, extracts all digits from the result and compares them with a pre-defined number. This can be used, for example, to extract and compare prices, e.g. to find a match only when a certain price is higher than 1000.

For example: RegexCmp(\d+([,\.]\d+)* Euro;,; > 1000)
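
The find, extract, compare sequence can be sketched in Python as follows. This models only the description above: the name `regex_cmp` and its parameters are hypothetical, the operators other than ">" are an assumption, and the sketch does not parse the RegexCmp argument syntax or handle decimal and thousands separators, since the text above does not describe them.

```python
import operator
import re

# Only ">" appears in the example above; the other operators are assumptions.
COMPARISONS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def regex_cmp(text, pattern, op, number):
    """Return True if the digits of the first match compare as requested.

    Illustrative model: all digits in the match are joined into one
    integer, so separators inside the match are simply dropped.
    """
    m = re.search(pattern, text, flags=re.IGNORECASE)
    if m is None:
        return False
    digits = "".join(re.findall(r"[0-9]", m.group(0)))
    if not digits:
        return False
    return COMPARISONS[op](int(digits), number)


price_pattern = r"\d+([,\.]\d+)* Euro"
print(regex_cmp("Price: 1299 Euro", price_pattern, ">", 1000))
print(regex_cmp("Price: 899 Euro", price_pattern, ">", 1000))
```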

CSS based filters

Page content can also be ignored or watched by CSS class names.

 

css( CLASSNAME )

Ignores/Watches all text content that is formatted with the defined CSS class name.
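
As a rough illustration of what a class-based ignore filter does, the Python sketch below drops all text inside elements that carry a given class and keeps the rest. It uses the standard `html.parser` module. The class `ClassTextFilter` and the sample markup are hypothetical, and matching the class name case-insensitively is an assumption based on filter definitions being converted to lowercase.

```python
from html.parser import HTMLParser

# Elements without a closing tag must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextFilter(HTMLParser):
    """Collect page text, skipping text inside elements with a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name.lower()
        self.depth = 0   # > 0 while inside an element carrying the class
        self.kept = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").lower().split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth:
            self.kept.append(data)


def ignore_css_class(html, class_name):
    parser = ClassTextFilter(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.kept)


sample = '<p>News <span class="counter">42 visitors</span> today<br>ok</p>'
print(ignore_css_class(sample, "counter"))
```

A watch filter would do the opposite and keep only the text inside the matching elements.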

The order of filters is important!

Filters are always executed from top to bottom: the filter in the first line is executed before the filter in the second line, and so on.

 

Content that has been filtered by one filter is no longer available to the filter definitions in the following lines.

 

Example:

 

You have defined the following two ignore filters:

 

   Watcher

   WebSite-Watcher

 

The first filter ignores (deletes) every occurrence of "Watcher", including the one inside "WebSite-Watcher". The second filter will never find a match since "Watcher" is no longer available. The correct order for these two filters would be:

 

   WebSite-Watcher

   Watcher

 

Here the first filter ignores (deletes) every occurrence of "WebSite-Watcher". The second filter can then ignore all remaining occurrences of "Watcher".
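
The effect of the filter order can be reproduced with a few lines of Python. This is a simplified model of static text filters applied from top to bottom, not WebSite-Watcher's implementation; `apply_ignore_filters` and the sample page text are made up for the illustration.

```python
import re


def apply_ignore_filters(text, filters):
    """Apply static text ignore filters from top to bottom (case insensitive)."""
    for phrase in filters:
        # Each filter only sees what the previous filters left behind.
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    return text


page = "WebSite-Watcher and Watcher"

# Wrong order: "Watcher" is deleted first, so "WebSite-Watcher" never matches
# and the fragment "WebSite-" is left behind.
print(repr(apply_ignore_filters(page, ["Watcher", "WebSite-Watcher"])))

# Correct order: the longer phrase is deleted first, then the remaining "Watcher".
print(repr(apply_ignore_filters(page, ["WebSite-Watcher", "Watcher"])))
```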

Outdated filters

Outdated ignore filters are automatically detected and deleted after some time when they no longer find any matches. There's no need to delete ignore filters manually. That behavior can be disabled in the Tweaks section (although we do not recommend it).

 

Outdated watch filters are not deleted automatically. You have to maintain this kind of filter manually.

Don't forget to test your filter settings

The "Test filter" feature lets you verify your filter definitions at any time by comparing the filtered text content of the new page with the filtered text content of the old page.

 

 

1. Open the bookmark properties
2. Click the "Filter-Assistant" button (or the text "Manual text filter")
3. Enter the filter definitions

 

Related topics

Wildcards
Regular Expressions

 

 



