REBOL Programming/Language Features/Recursion/wikichanges

The following example shows how to crawl a wikibook with a recursive function.

REBOL [
   file: %wikichanges.r
   author: "Graham"
   date: 18-Sep-2005
   rights: 'BSD
   purpose: {
       To display the changes in a wikibook in reverse date order using a recursive function
   }
   
]

home: https://wikibooks.cn/wiki/REBOL_Programming
relative-root: "/wiki/REBOL_Programming/"
root: https://wikibooks.cn/

pages: []
updates: []

rebuild-date: func [ 
   { builds a date value from the update stamp on a wikipage }
   date [string!] 
   /local d
][
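   ; the stamp is expected to load as [ time day month year ], e.g. 23:13 17 September 2005
   d: load date
   ; reassemble as day-Mon-year/time, e.g. 17-Sep-2005/23:13
   to-date rejoin [ d/2 "-" copy/part form d/3 3 "-" d/4 "/" d/1 ]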
]

get-link: func [ 
   { recursive function to get the links from a wikipage }
   page [url!] 
   /local tags internal internal-link content updated stamp
][
   wait .5 ; don't overload the wikibook server with too many requests at once
   print [ "loading ... " page ]
   content: read page
   ; only parse the update stamp if the page actually carries one
   stamp: find/last/tail content "This page was last modified"
   if all [ stamp  parse stamp [ copy updated to "." to end ] ][
       print [ "Updated: " updated ]
       repend updates [ rebuild-date updated page ]
   ]
   tags: load/markup content
   foreach tag tags [
       if parse tag  [ to relative-root skip copy internal to {"} to "title=" to end ][
           if not find pages internal-link: join root trim internal [
               append pages internal-link
               get-link internal-link
           ]
       ]
   ]    
]

append pages home

; grab all the links
get-link home

updates: sort/skip/reverse updates 2

; print them out
foreach [ date page ] updates [ print [ date page ]]

It produces output like this:

17-Sep-2005/23:13 https://wikibooks.cn/wiki/REBOL_Programming/Language_Features/VID
17-Sep-2005/5:18 https://wikibooks.cn/wiki/REBOL_Programming/Language_Features/Control
16-Sep-2005/18:41 https://wikibooks.cn/wiki/REBOL_Programming/Third_Party
...
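
The recursion in get-link terminates because every URL is appended to pages before it is followed, so a page that links back to one already seen is never loaded twice. As a rough, network-free sketch of that pattern, the snippet below walks a small in-memory link map instead of reading real pages. The names links, visited and visit, and the page names "A" to "D", are made up for illustration and are not part of wikichanges.r.

```rebol
REBOL []

; hypothetical link map: each page name is followed by the block of pages it links to
links: [
    "A" ["B" "C"]
    "B" ["A" "D"]
    "C" ["D"]
    "D" []
]

; plays the role of the pages block in wikichanges.r
visited: []

visit: func [
    { recursive function that follows every link not seen before }
    page [string!]
][
    foreach target select links page [
        if not find visited target [
            append visited target   ; record the page before descending into it
            visit target
        ]
    ]
]

append visited "A"
visit "A"
probe visited
```

Running this should print ["A" "B" "D" "C"]: the walk is depth-first, and the link from "B" back to "A" is skipped because "A" is already in visited.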