Merge pages in forum thread

This example shows how to follow a forum thread to its next page as soon as that page becomes available.

 

The links to the previous and next pages are typically displayed as:

  First ... 13 14 15 ... Last

 

If we are on page 14, for example, and the page already contains a link to page 15, this plugin returns the new URL so that WebSite-Watcher can merge the new page with the current one.

 

Links of the pages have the following format:

  http://domain.com/forum/topic123x-14.html

where 14 is the page number. The URL of the first page does not include a page number; it is just topic123x.html.
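
As a rough illustration of this URL scheme, the following Python sketch (not WebSite-Watcher script) derives the URL of the next page from the current one. The function name next_page_url is hypothetical, and the sketch assumes that the page number, when present, is always the trailing -N directly before .html.

```python
import re

def next_page_url(url: str) -> str:
    """Return the URL of the following page for the 'topic123x-14.html' scheme.

    Assumption: the page number, if present, is the trailing '-N' before '.html'.
    """
    match = re.search(r"-(\d+)\.html$", url)
    if match is None:
        # First page: the URL carries no page number, so continue with page 2.
        return re.sub(r"\.html$", "-2.html", url)
    next_number = int(match.group(1)) + 1
    # Keep everything before '-N.html' and append the incremented number.
    return url[:match.start()] + f"-{next_number}.html"

print(next_page_url("http://domain.com/forum/topic123x-14.html"))
print(next_page_url("http://domain.com/forum/topic123x.html"))
```

A topic name that itself ends in a hyphen followed by digits would be misread as a page number, so the pattern may need tightening for a real forum.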

 

Each posting in the forum thread has a posting number, displayed as # 12 for posting number 12. We will define a watch filter that monitors only these posting numbers. This avoids false positives when other content changes, for example forum statistics, a changing date or advertisements.
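
To see what such a filter would and would not pick up, the pattern can be tried outside WebSite-Watcher. The Python sketch below applies the expression #\s*\d+ to an invented line of page text; the sample text and the helper name posting_numbers are assumptions for illustration, and Python's regex engine may differ in details from the one used by WebSite-Watcher.

```python
import re

# '#', optional whitespace, then one or more digits (e.g. '# 12' or '#13').
POST_NUMBER = re.compile(r"#\s*\d+")

def posting_numbers(page_text: str) -> list[str]:
    """Return every posting marker such as '# 12' found in the page text."""
    return POST_NUMBER.findall(page_text)

# Invented sample: statistics and timestamps surround two posting markers.
sample = "Members online: 57 | # 12 Hello | Posted 10:31 | #13 Reply"
print(posting_numbers(sample))  # → ['# 12', '#13']
```

The member count and the timestamp are ignored because neither is preceded by #, so a change in those values alone would not alter the filtered result.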

 

 

Sub Wsw_BeforeCheck()

   ' Set a watch filter to monitor only posting numbers

   ' so we will only get an update notification if a new posting number appears

   Bookmark_SetProperty("watch_filter", "regex(#\s*\d+)")

   Bookmark_SetProperty("filter_ignore_removed_content", "1")

End Sub

 

'*******************************************************************************

 

Sub Wsw_MergePages($sMem, $nPageNumber, $sUrl, ByRef $sNewUrl, ByRef $sNewPostData, ByRef $bChangeBookmarkUrl, ByRef $sStatusMessage, ByRef $iStatusCode)

   

   Dim $nPageParamNumber, $sFilename

   

   ' Limit the number of merged pages

   If $nPageNumber > 3 Then

      Return

   End If

   

   ' Extract page number (this pattern assumes pages named index<N>.html; adapt it to the forum's URL scheme)

   $sFilename = GetFirstRegexMatch($sUrl, "index\d+\.html")

   If $sFilename = "" Then

      $nPageParamNumber = 2

   Else

      $sUrl = Replace($sUrl, $sFilename, "")

      $nPageParamNumber = CInt(ExtractDigits($sFilename)) + 1

   End If

   $sFilename = "index" + CStr($nPageParamNumber) + ".html"

   

   ' Generate possible next page

   If Right($sUrl, 1) <> "/" Then

      $sUrl = $sUrl + "/"

   End If

   $sUrl = $sUrl + $sFilename

   

   ' Check whether the next page is already linked in the page source

   If Pos($sUrl, $sMem) > 0 Then

      ' the next page already exists

      $sStatusMessage = "New page via Merge-Plugin"

      $sNewUrl = $sUrl 'return new URL

      $bChangeBookmarkUrl = True 'change URL in bookmark properties

   End If

End Sub

 

 

If a forum thread passes the page number as a parameter in the URL (for example &page=13), then the following code might be a starting point:

 

Sub Wsw_BeforeCheck()

   ' Optional watch filter if each post has a posting number, e.g. #14

   Bookmark_SetProperty("watch_filter", "regex(#\s*\d+)")

   Bookmark_SetProperty("filter_ignore_removed_content", "1")

End Sub

 

'*******************************************************************************

 

Sub Wsw_MergePages($sMem, $nPageNumber, $sUrl, ByRef $sNewUrl, ByRef $sNewPostData, ByRef $bChangeBookmarkUrl, ByRef $sStatusMessage, ByRef $iStatusCode)

   

   Dim $nPageParamNumber, $sPageParam

   

   ' Limit the number of merged pages

   If $nPageNumber > 5 Then

      Return

   End If

   

   ' Extract page number

   $sPageParam = GetFirstRegexMatch($sUrl, "\&page=\d+")

   If $sPageParam = "" Then

      ' first page, continue with page 2

      $sUrl = $sUrl + "&page=2"

   Else

      ' increase page number

      $nPageParamNumber = CInt(ExtractDigits($sPageParam)) + 1

      $sUrl = Replace($sUrl, $sPageParam, "&page=" + CStr($nPageParamNumber))

   End If

   

   ' Check whether the next page is already linked in the page source

   If Pos($sUrl, $sMem) > 0 Then

      ' the next page already exists

      $sNewUrl = $sUrl

      $bChangeBookmarkUrl = True ' change bookmark URL to latest merged URL

      $sStatusMessage = "New page via Merge-Plugin"

   End If

   

End Sub
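
 

The parameter handling above can be checked offline before it is used in a plugin. The following Python sketch (not WebSite-Watcher script) mirrors the same two steps: build the candidate URL, then test whether it occurs in the page source. The function names and the showthread.php URLs are hypothetical examples.

```python
import re

def next_page_url_by_param(url: str) -> str:
    """Return the URL of the next page for an '&page=N' style parameter."""
    match = re.search(r"&page=(\d+)", url)
    if match is None:
        # First page: no page parameter yet, so continue with page 2.
        return url + "&page=2"
    next_number = int(match.group(1)) + 1
    # Replace only the matched parameter and keep the rest of the URL.
    return url[:match.start()] + f"&page={next_number}" + url[match.end():]

def next_page_is_linked(url: str, page_source: str) -> bool:
    """True if the candidate next-page URL appears in the page source."""
    return next_page_url_by_param(url) in page_source

current = "http://domain.com/forum/showthread.php?t=123&page=13"
source = '<a href="http://domain.com/forum/showthread.php?t=123&page=14">14</a>'
print(next_page_url_by_param(current))
print(next_page_is_linked(current, source))
```

A plain substring test also matches longer page numbers (for example &page=140) and fails if the forum writes links with a relative path or with & encoded as &amp;, so a real plugin may need a stricter comparison.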