XQuery/XQuery and Python

In [1], Cameron Laird gives an example of Python code that extracts and lists the a tags in an XHTML page:

    import xml.etree.ElementTree as ET

    # List every a element together with its link target or anchor name.
    for element in ET.parse("draft2.xml").findall(".//a"):
        attributes = element.attrib
        if "href" in attributes:
            print("'%s' is at URL '%s'." % (element.text,
                                            attributes["href"]))
        if "name" in attributes:
            print("'%s' anchors '%s'." % (element.text,
                                          attributes["name"]))

The XQuery equivalent of this Python code is:

 for $a in doc("http://en.wikipedia.org/wiki/XQuery")//*:a
 return   
   if ($a/@href)
   then concat("'", $a,"'  is at  URL '",$a/@href,"'&#10;")
   else if ($a/@name)
   then concat("'", $a,"'  anchors '",$a/@name,"'&#10;")
   else ()

Tags in the Wikipedia XQuery page (view source)

The namespace prefix in *:a is a wildcard, because we do not know in advance which namespace the page's HTML elements belong to.
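The Python version needs the same care. In a namespaced XHTML document, ElementTree stores each tag as {namespace}a, so a search for plain a finds nothing. The following is a minimal sketch, assuming Python 3.8 or later (where ElementTree accepts the {*} wildcard); the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; its elements live in the XHTML namespace.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <a href="http://example.org/">Example</a>
    <a name="top">Top</a>
  </body>
</html>"""

root = ET.fromstring(SAMPLE)

# A plain tag name only matches elements that are in no namespace.
plain = root.findall(".//a")

# "{*}a" matches a elements in any namespace (or none), much like *:a in XQuery.
wildcard = root.findall(".//{*}a")

print(len(plain), len(wildcard))
```

The wildcard trades precision for convenience: it also matches a elements from unrelated vocabularies, which is usually acceptable when scraping a single HTML page.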

The query can be written more concisely, though somewhat less readably (the quotes are omitted from the output for clarity), as:

 string-join(
      doc("http://en.wikipedia.org/wiki/XQuery")//*:a/
        (if (@href)
        then concat(.,"  is at  URL ",@href)
        else if (@name)
        then concat(.," anchors ", @name)
        else ()
        )
         ,'&#10;'
     )

Tags in the Wikipedia XQuery page (view source)
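For comparison, the string-join form has a close Python counterpart: collect one line per matching element and join the lines with newlines. This is only a sketch. The function name describe_links and the sample markup are hypothetical, and it assumes Python 3.8 or later for the {*} wildcard.

```python
import xml.etree.ElementTree as ET


def describe_links(xhtml_text):
    """Return one line per a element that carries an href or a name."""
    root = ET.fromstring(xhtml_text)
    lines = []
    for a in root.findall(".//{*}a"):
        # Concatenate all descendant text, like the string value of "." in XQuery.
        text = "".join(a.itertext())
        if "href" in a.attrib:
            lines.append(text + "  is at  URL " + a.attrib["href"])
        elif "name" in a.attrib:
            lines.append(text + " anchors " + a.attrib["name"])
        # Elements with neither attribute contribute nothing, like "else ()".
    return "\n".join(lines)


# Hypothetical sample input for illustration.
LINK_SAMPLE = (
    '<div><a href="http://example.org/">Example</a>'
    '<a name="top">Top</a><a>bare</a></div>'
)
print(describe_links(LINK_SAMPLE))
```

As in the XQuery version, an element with both attributes is reported only by its href, because the name test sits in the else branch.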

More usefully, we might supply the URL of any XHTML page as a parameter and generate an HTML page listing its external links:

 declare option exist:serialize "method=xhtml media-type=text/html";
 
 let $url :=request:get-parameter("url",())
 return
   <html>
       <h1>External links in {$url}</h1>
        { 
         for $a in doc($url)//*:a[text()][starts-with(@href,'http://')]
         return 
                <div><b>{string($a)}</b> is at  <a href="{$a/@href}"><i>{string($a/@href)}</i> </a></div>
        }
   </html>

Wikipedia XQuery
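A rough Python counterpart of the page generator can be sketched with ElementTree alone. The function name external_links_page and the sample markup are hypothetical. The sketch takes the page text as a string rather than fetching a URL, and it only approximates the [text()] predicate by requiring non-empty text content. It assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET


def external_links_page(xhtml_text, source_label):
    """Build an HTML page listing the a elements that link to http:// URLs."""
    root = ET.fromstring(xhtml_text)
    page = ET.Element("html")
    heading = ET.SubElement(page, "h1")
    heading.text = "External links in " + source_label
    for a in root.findall(".//{*}a"):
        label = "".join(a.itertext())
        href = a.get("href", "")
        # Approximates [text()][starts-with(@href,'http://')] from the query.
        if label and href.startswith("http://"):
            div = ET.SubElement(page, "div")
            bold = ET.SubElement(div, "b")
            bold.text = label
            bold.tail = " is at "
            link = ET.SubElement(div, "a", href=href)
            italic = ET.SubElement(link, "i")
            italic.text = href
    # Building a tree, not a string, lets the serializer escape the values.
    return ET.tostring(page, encoding="unicode", method="html")


# Hypothetical sample input for illustration.
PAGE_SAMPLE = (
    '<div><a href="http://example.org/">Example</a>'
    '<a href="mailto:someone@example.org">Mail</a>'
    '<a href="/wiki/Local">Local</a></div>'
)
print(external_links_page(PAGE_SAMPLE, "sample"))
```

Like the query, the prefix test treats only http:// links as external, so https:// links are skipped unless the condition is widened.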