XPathLink
Abstract: XPathLink is a query language for XML documents linked with XLink links. XPathLink is based on XPath and extends it with transparent XLink support. The implementation of XPathLink in Scheme is provided.
XLink is a language for describing links between resources using XML attributes and namespaces. XLink provides expressive means for linking information in different XML documents. With XLink, practical XML application data can be expressed as several linked XML documents rather than a single complicated XML document. Such a design makes it very attractive to have a query language that inherently recognizes XLink links and provides a natural navigation mechanism over them. Kirill Lisovsky and I designed and implemented such a query language. The language is an extension of XPath; its implementation is based on SXML and extends SXPath. We call it SXPath with XLink support, or SXPathLink.
Additionally, an HTML <A> hyperlink can be considered a special case of an XLink link. This observation makes it possible to query HTML documents with SXPathLink as well. Neil W. Van Dyke <neil@neilvandyke.org> and his permissive HTML parser HtmlPrag have made this feature possible.
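To make the observation about HTML hyperlinks concrete, here is a minimal sketch in Python. It does not use HtmlPrag or SXPathLink; the class `AnchorArcs`, the function `anchor_arcs`, and the dictionary key `to` are hypothetical names chosen for this illustration. Under that assumption, each `<A HREF="...">` element is read as a simple link whose ending resource is the `href` target.

```python
from html.parser import HTMLParser


class AnchorArcs(HTMLParser):
    """Collects one toy 'arc' per <a href="..."> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.arcs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Anchors without href (e.g. <a name="x">) link nowhere
                # and are skipped.
                self.arcs.append({"to": href})


def anchor_arcs(html_text):
    """Return the list of toy arcs found in an HTML string."""
    parser = AnchorArcs()
    parser.feed(html_text)
    parser.close()
    return parser.arcs


sample = '<p>See <A HREF="spec.html#arc">the spec</A> and <a name="x">here</a>.</p>'
print(anchor_arcs(sample))
```

Only the anchor that carries an `href` yields an arc; the named anchor is ignored.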
The rest of the page is organized as follows. Section 2 gives an overview of the proposed extension to XPath with XLink support. SXPathLink is available for download in Section 3.
Kirill Lisovsky and I propose an extension to XPath that allows querying XML documents that contain XLink links and navigating those links in a natural way. The proposed extension to XPath has the following features:
The additional axes that the proposed extension adds to XPath are discussed below.
The term “traverse” is introduced in the XLink specification and denotes following an XLink arc from its starting resource to its ending resource. The notion of traverse in XLink closely corresponds to the semantics that the XPath specification gives to an axis.
Suppose that we have two nodes A and B (possibly in different XML documents) and that, for some XLink arc, node A is the starting resource and node B is the ending resource. Suppose also that we evaluate an XPath expression with A as the context node. Then the traverse:: axis selects the node B.
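The A-to-B scenario can be mimicked with a small toy model in Python. This is only an illustrative sketch, not the SXPathLink implementation: the `Node` and `Arc` classes and the `traverse` function are hypothetical names, and a node is reduced to a document URI plus a label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for an XML node: a document URI plus a label."""
    doc: str
    name: str


@dataclass(frozen=True)
class Arc:
    """Toy XLink arc: traversal goes from `start` to `end`."""
    start: Node
    end: Node


def traverse(context, arcs):
    """Toy traverse:: axis: ending resources of the arcs that start at `context`."""
    return [arc.end for arc in arcs if arc.start == context]


a = Node("doc1.xml", "A")
b = Node("doc2.xml", "B")
c = Node("doc2.xml", "C")
arcs = [Arc(a, b), Arc(c, a)]

# With A as the context node, only the arc A -> B starts at A.
print(traverse(a, arcs))
```

Here the arc from C to A is not followed, because A is its ending resource rather than its starting resource.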
More formally, the description of traverse:: axis can be given as follows:
Even this single additional axis allows expressive queries over (S)XML documents that contain XLink links. The use cases considered here illustrate the practical application of the traverse:: axis.
The XPathLink packages available for download in Section 3 contain implementations of the use cases considered.
The traverse:: axis discussed in the previous subsection can be sufficient for most practical tasks of querying linked documents.
However, for more sophisticated document processing tasks, you may wish to query the arcs themselves. This functionality is achieved by means of two design principles:
An SXLink arc contains all the information about the arc, and thus can be considered a view of an XLink arc in the form of SXML. The SXLink specification formally defines the grammar for an SXLink arc and illustrates it with a variety of examples.
From the point of view of the XPath data model, an SXLink arc can be considered an additional (eighth) type of node. An SXLink arc can be accessed by the arc:: axis exclusively, similarly to an attribute node, which can be accessed by the attribute:: axis exclusively.
After selecting SXLink arcs by means of the arc:: axis, and possibly filtering them with conventional XPath predicates, you may wish to traverse the selected arcs to their ending resources. The traverse-arc:: axis is introduced for this purpose:
The following equation holds:
traverse::NodeTest = arc::*/traverse-arc::NodeTest
in that both location paths select equal node-lists.
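The equation can be checked on a toy model in Python. As before, this is a sketch under stated assumptions rather than the SXPathLink implementation: `Node`, `Arc`, `arc_axis`, `traverse_arc`, and `traverse` are hypothetical names, a node test is modelled as a plain predicate function, and the `arcrole` field is an assumed stand-in for the information an arc carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for an XML node."""
    doc: str
    name: str


@dataclass(frozen=True)
class Arc:
    """Toy arc with an assumed `arcrole` field to filter on."""
    start: Node
    end: Node
    arcrole: str = ""


def arc_axis(context, arcs):
    """Toy arc:: axis: the arcs whose starting resource is `context`."""
    return [arc for arc in arcs if arc.start == context]


def traverse_arc(selected, node_test=lambda n: True):
    """Toy traverse-arc:: axis: ending resources of already selected arcs."""
    return [arc.end for arc in selected if node_test(arc.end)]


def traverse(context, arcs, node_test=lambda n: True):
    """Toy traverse:: axis: follow every arc that starts at `context`."""
    return [arc.end for arc in arcs
            if arc.start == context and node_test(arc.end)]


a = Node("doc1.xml", "A")
b = Node("doc2.xml", "B")
c = Node("doc3.xml", "C")
arcs = [Arc(a, b, "cites"), Arc(a, c, "extends"), Arc(b, c, "cites")]


def is_b(node):
    """Plays the role of NodeTest in the equation."""
    return node.name == "B"


direct = traverse(a, arcs, is_b)                    # traverse::NodeTest
two_step = traverse_arc(arc_axis(a, arcs), is_b)    # arc::*/traverse-arc::NodeTest

# Filtering arcs between the two steps, in the spirit of an XPath predicate.
cites_only = traverse_arc([x for x in arc_axis(a, arcs) if x.arcrole == "cites"])

print(direct == two_step, cites_only)
```

The two-step form becomes useful once a filter is placed between the steps, as in `cites_only`, which the single traverse:: step cannot express.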
XPathLink is included in the SSAX-SXML package.
To download the package, go to the Download page.