XQuery/Caching and indexes
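Data views of individual teams or groups need to be supplemented with indexes to the resources appropriate for those views. Generating the index on demand is one approach, but it puts load on the SPARQL server. Given the batch nature of the DBpedia extract, it makes more sense to cache the index data and generate the index page from the cache. (Triggering a cache refresh is another problem!)
The following script generates an index page on demand, with links to the HTML view and the timeline view of each artist's albums.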
declare option exist:serialize "method=xhtml media-type=text/html";

(: SPARQL template; the category name is substituted below :)
declare variable $query := "
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX p: <http://dbpedia.org/property/>
SELECT * WHERE {
    ?group skos:subject <http://dbpedia.org/resource/Category:Rock_and_Roll_Hall_of_Fame_inductees>.
}
";

(: Turn a DBpedia resource name into a readable label :)
declare function local:clean($text) {
    let $text := util:unescape-uri($text, "UTF-8")
    let $text := replace($text, "\(.*\)", "")
    let $text := replace($text, "_", " ")
    return $text
};

(: Category name as it appears in the DBpedia URI,
   e.g. Rock_and_Roll_Hall_of_Fame_inductees :)
let $category := request:get-parameter("category", "")
let $categoryx := replace($category, "_", " ")
let $queryx := replace($query, "Rock_and_Roll_Hall_of_Fame_inductees", $category)
(: escape-uri is the call used by the original script;
   newer processors provide encode-for-uri instead :)
let $sparql := concat("http://dbpedia.org/sparql?default-graph-uri=",
                      escape-uri("http://dbpedia.org", true()),
                      "&amp;query=", escape-uri($queryx, true()))
let $result := doc($sparql)
return
    <html>
        <body>
            <h1>{$categoryx}</h1>
            <table border="1">
            {
                (: the first row of the result table holds the column headings :)
                for $row in $result/table//tr[position() > 1]
                let $resource := substring-after($row/td[1], "resource/")
                let $name := local:clean($resource)
                order by $name
                return
                    <tr>
                        <td>{$name}</td>
                        <td><a href="group2html.xq?group={$resource}">HTML</a></td>
                        <td><a href="groupTimeline.xq?group={$resource}">Timeline</a></td>
                    </tr>
            }
            </table>
        </body>
    </html>
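To illustrate what the name-cleaning helper in the script does, here is a minimal sketch that uses only standard XQuery functions. It leaves out the eXist-specific util:unescape-uri step and adds a normalize-space call, which is not in the original script, to trim the space left behind when a parenthesised qualifier is removed. The function name local:clean-name and the sample identifiers are assumptions made for illustration.

```xquery
xquery version "1.0";

(: Hypothetical, portable variant of local:clean (no eXist extensions). :)
declare function local:clean-name($text as xs:string) as xs:string {
    (: drop a parenthesised qualifier such as "(band)" :)
    let $text := replace($text, "\(.*\)", "")
    (: DBpedia resource names use underscores in place of spaces :)
    let $text := replace($text, "_", " ")
    (: assumption: trim the space left over from the first step :)
    return normalize-space($text)
};

for $id in ("The_Beach_Boys", "Genesis_(band)")
return local:clean-name($id)
```

Evaluated as written, this returns the two strings "The Beach Boys" and "Genesis".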
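Two scripts are needed for caching: one to generate the data to be cached, the other to generate the index page. The approach is illustrated with an index of rock groups based on the Wikipedia category Rock and Roll Hall of Fame inductees.
This script generates an XML file. Further development would store the XML directly in the database, but the output can also be saved manually to the appropriate location. The script is parameterised by a category.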
(: SPARQL template; the category name is substituted below :)
declare variable $query := "
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX p: <http://dbpedia.org/property/>
SELECT * WHERE {
    ?group skos:subject <http://dbpedia.org/resource/Category:Rock_and_Roll_Hall_of_Fame_inductees>.
}
";

(: Turn a DBpedia resource name into a readable label :)
declare function local:clean($text) {
    let $text := util:unescape-uri($text, "UTF-8")
    let $text := replace($text, "\(.*\)", "")
    let $text := replace($text, "_", " ")
    return $text
};

(: Convert the HTML result table into a sequence of tuples,
   one element per column, named after the column heading :)
declare function local:table-to-seq($table) {
    let $head := $table/tr[1]
    for $row in $table/tr[position() > 1]
    return
        <tuple>
        {
            for $cell at $i in $row/td
            return element {$head/th[position() = $i]} {string($cell)}
        }
        </tuple>
};

let $category := request:get-parameter("category", "Rock_and_Roll_Hall_of_Fame_inductees")
let $queryx := replace($query, "Rock_and_Roll_Hall_of_Fame_inductees", $category)
(: send the substituted query ($queryx), not the template :)
let $sparql := concat("http://dbpedia.org/sparql?default-graph-uri=",
                      escape-uri("http://dbpedia.org", true()),
                      "&amp;query=", escape-uri($queryx, true()))
let $result := doc($sparql)/table
let $groups := local:table-to-seq($result)
return
    <ResourceList category="{$category}">
    {
        for $group in $groups
        let $resource := substring-after($group/group, "resource/")
        let $name := local:clean($resource)
        order by $name
        return <resource id="{$resource}" name="{$name}"/>
    }
    </ResourceList>
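The "further development" mentioned above, storing the XML directly in the database, might look like the following sketch. It assumes eXist-db and its xmldb module. The collection path /db/cache, the file-naming scheme and the function names local:cache-name and local:store-list are hypothetical, and the script would need to run as a user with write access to an existing collection at that path.

```xquery
xquery version "1.0";

(: Assumption: eXist-db, where the xmldb module lives at this namespace. :)
import module namespace xmldb = "http://exist-db.org/xquery/xmldb";

(: Hypothetical cache location; it must already exist. :)
declare variable $cache-collection := "/db/cache";

(: Derive a document name from the category (hypothetical naming scheme).
   Categories containing characters that are awkward in resource names
   would need extra handling. :)
declare function local:cache-name($category as xs:string) as xs:string {
    concat($category, ".xml")
};

(: Store a ResourceList and return the path of the stored document. :)
declare function local:store-list($list as element(ResourceList)) as xs:string? {
    xmldb:store($cache-collection,
                local:cache-name(string($list/@category)),
                $list)
};

(: Stand-in for the ResourceList built by the script above. :)
let $list :=
    <ResourceList category="Rock_and_Roll_Hall_of_Fame_inductees">
        <resource id="The_Beach_Boys" name="The Beach Boys"/>
    </ResourceList>
return local:store-list($list)
```

Storing under a name derived from the category means that re-running the generator for the same category overwrites the earlier copy rather than adding a duplicate ResourceList.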
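Note: I guess a better approach would be to work with the triples and save them to a local triple store.
This script, groupList, uses the cached index data: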
declare option exist:serialize "method=xhtml media-type=text/html";

(: look up the cached list anywhere in the database :)
let $list := //ResourceList[@category = "Rock_and_Roll_Hall_of_Fame_inductees"]
return
    <html>
        <body>
            <h1>Rock Groups</h1>
            <table border="1">
            {
                for $resource in $list/resource
                order by $resource/@name
                return
                    <tr>
                        <td>{string($resource/@name)}</td>
                        <td><a href="group2html.xq?group={$resource/@id}">HTML</a></td>
                        <td><a href="groupTimeline.xq?group={$resource/@id}">Timeline</a></td>
                    </tr>
            }
            </table>
        </body>
    </html>
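Because the cached ResourceList records its category, the index page need not hard-code it. The sketch below reads a category request parameter, as the generator script does, and looks up the matching cached list. The helper name local:rows, the fallback message and the use of the category as the page heading are assumptions, not part of the original script.

```xquery
xquery version "1.0";

declare option exist:serialize "method=xhtml media-type=text/html";

(: Build the table rows for one cached list (hypothetical helper). :)
declare function local:rows($list as element(ResourceList)) as element(tr)* {
    for $resource in $list/resource
    order by $resource/@name
    return
        <tr>
            <td>{string($resource/@name)}</td>
            <td><a href="group2html.xq?group={$resource/@id}">HTML</a></td>
            <td><a href="groupTimeline.xq?group={$resource/@id}">Timeline</a></td>
        </tr>
};

let $category := request:get-parameter("category", "Rock_and_Roll_Hall_of_Fame_inductees")
(: take the first match in case a category was cached more than once :)
let $list := (//ResourceList[@category = $category])[1]
return
    <html>
        <body>
            <h1>{replace($category, "_", " ")}</h1>
            {
                if (exists($list))
                then <table border="1">{local:rows($list)}</table>
                else <p>No cached index for this category.</p>
            }
        </body>
    </html>
```

With this variant, one script can serve an index for every category that has been cached, and a request for an uncached category produces a short message instead of an empty table.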