XPathLink

Abstract: XPathLink is a query language for XML documents connected by XLink links. XPathLink is based on XPath and extends it with transparent XLink support. An implementation of XPathLink in Scheme is provided.

1  Introduction

XLink is a language for describing links between resources using XML attributes and namespaces. XLink provides expressive means for linking information in different XML documents. With XLink, the data of a practical XML application can be expressed as several linked XML documents rather than as a single complicated XML document. Such a design makes it attractive to have a query language that inherently recognizes XLink links and provides a natural navigation mechanism over them. Kirill Lisovsky and I designed and implemented such a query language. The language is an extension to XPath, and its implementation is based on SXML and extends SXPath. We call it SXPath with XLink support, or SXPathLink.

Additionally, an HTML <A> hyperlink can be considered as a particular case of an XLink link. This observation makes it possible to query HTML documents with SXPathLink as well. Neil W. Van Dyke <neil@neilvandyke.org> and his permissive HTML parser HtmlPrag have made this feature possible.
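To make the observation about HTML hyperlinks concrete, here is a minimal Python sketch that reads each `<a href="...">` element as if it were a simple XLink link and records it as an arc description. It uses Python's standard `html.parser` module rather than HtmlPrag, and the `AnchorArcs` class, the `anchor_arcs` function and the dictionary layout are illustrative assumptions, not part of SXPathLink.

```python
from html.parser import HTMLParser


class AnchorArcs(HTMLParser):
    """Collects <a href="..."> elements as simple-link arc descriptions.

    Hypothetical helper for illustration only; not the SXPathLink API.
    """

    def __init__(self):
        super().__init__()
        self.arcs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Treat the hyperlink as a simple link pointing to href.
                self.arcs.append({"type": "simple", "href": href})


def anchor_arcs(html_text):
    """Return the arc descriptions for all hyperlinks in html_text."""
    parser = AnchorArcs()
    parser.feed(html_text)
    parser.close()
    return parser.arcs


if __name__ == "__main__":
    page = '<p><A HREF="doc.html#s2">Section 2</A> <a name="top">anchor</a></p>'
    print(anchor_arcs(page))
```

Anchors without an `href` attribute (such as named anchors) define no link and are skipped in this sketch.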

The rest of the page is organized as follows. Section 2 gives an overview of the proposed extension to XPath with XLink support. Section 3 explains where SXPathLink can be downloaded.

2  Extension to XPath

Kirill Lisovsky and I propose an extension to XPath that allows querying XML documents that contain XLink links, and navigating XLink links in a natural way. The proposed extension to XPath has the following features:

The additional axes that make up our proposed XPath extension are discussed below.

2.1  Traverse:: axis

The term “traverse” is introduced in the XLink specification and denotes following an XLink arc from its starting resource to its ending resource. The notion of traversal in XLink closely corresponds to the semantics that the XPath specification gives to an axis.

Suppose that we have two nodes A and B (possibly in different XML documents) and an XLink arc for which node A is the starting resource and node B is the ending resource. Suppose further that we evaluate an XPath expression with A as the context node. Then the traverse:: axis selects node B.

More formally, the description of traverse:: axis can be given as follows:

Description 1   The traverse:: axis contains the nodes that can be traversed from the context node with XLink arcs. That is, for all arcs that start from the context node, the traverse:: axis returns nodes that are the ending resource for these arcs.
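Description 1 can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the Scheme implementation: the `Node` and `Arc` classes and the `traverse` function are hypothetical names, and an arc is reduced to a pair of starting and ending resources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    doc: str   # URI of the document that holds the node (assumed model)
    name: str  # element name


@dataclass(frozen=True)
class Arc:
    start: Node  # starting resource
    end: Node    # ending resource


def traverse(context, arcs):
    """Model of the traverse:: axis: the ending resources of all
    arcs whose starting resource is the context node."""
    return [arc.end for arc in arcs if arc.start == context]


if __name__ == "__main__":
    a = Node("a.xml", "A")
    b = Node("b.xml", "B")
    c = Node("c.xml", "C")
    arcs = [Arc(a, b), Arc(a, c), Arc(b, c)]

    assert traverse(a, arcs) == [b, c]  # two arcs start from A
    assert traverse(c, arcs) == []      # no arc starts from C
```

Note that the ending resources may live in different documents than the context node; in this model that only shows up as a different `doc` value.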

Even this single additional axis allows expressive queries over (S)XML documents that contain XLink links. Several use cases illustrate the practical application of the traverse:: axis.

The XPathLink packages available for download in Section 3 contain the implementation of these use cases.

2.2  Arc:: and traverse-arc:: axes

The traverse:: axis discussed in the previous subsection can be sufficient for most practical tasks of querying linked documents.

However, for more sophisticated document processing tasks, you may wish to query the arcs themselves. This functionality is achieved by means of two design principles:

Description 2   The arc:: axis contains the SXLink arcs that start from the context node. Each SXLink arc is a well-formed SXML node (and thus can be processed by further SXPath location steps).

An SXLink arc contains all the information about the arc, and thus can be considered a view of an XLink arc in the form of SXML. The SXLink specification formally defines the grammar for an SXLink arc and illustrates it with a variety of examples.

From the point of view of the XPath data model, an SXLink arc can be considered an additional (eighth) type of node. An SXLink arc can be accessed by the arc:: axis exclusively, similarly to an attribute node, which can be accessed by the attribute:: axis exclusively.

After selecting SXLink arcs by means of the arc:: axis, and possibly filtering among several SXLink arcs by means of conventional XPath predicates, you may wish to traverse the selected arc(s) to their ending resources. The traverse-arc:: axis is introduced for this purpose:

Description 3   The traverse-arc:: axis contains the ending resource for the context node that is the SXLink arc or any of its descendant SXML nodes. For a different context node, the traverse-arc:: axis selects the empty node-list.

The following equation holds:

traverse::NodeTest = arc::*/traverse-arc::NodeTest

in that both location paths select equal node-lists.
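The equation can be checked on a small Python model of the three axes. This is a sketch under stated assumptions rather than the Scheme implementation: all class and function names are hypothetical, an arc is reduced to its starting and ending resources, and the traverse-arc:: axis is modeled only for a context node that is the arc itself (not for the arc's descendant SXML nodes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    doc: str   # URI of the document that holds the node (assumed model)
    name: str  # element name


@dataclass(frozen=True)
class Arc:
    start: Node  # starting resource
    end: Node    # ending resource


def matches(node, node_test):
    """Simplified NodeTest: '*' or an exact element name."""
    return node_test == "*" or node.name == node_test


def arc_axis(context, arcs):
    """arc:: axis: the arcs that start from the context node."""
    return [arc for arc in arcs if arc.start == context]


def traverse_arc_axis(context, node_test="*"):
    """traverse-arc:: axis: the ending resource if the context is an
    arc, and the empty node-list for any other context node."""
    if not isinstance(context, Arc):
        return []
    return [context.end] if matches(context.end, node_test) else []


def traverse_axis(context, arcs, node_test="*"):
    """traverse:: axis, evaluated directly."""
    return [arc.end for arc in arcs
            if arc.start == context and matches(arc.end, node_test)]


def arc_then_traverse_arc(context, arcs, node_test="*"):
    """The location path arc::*/traverse-arc::NodeTest."""
    result = []
    for arc in arc_axis(context, arcs):
        result.extend(traverse_arc_axis(arc, node_test))
    return result


if __name__ == "__main__":
    a, b, c = Node("a.xml", "A"), Node("b.xml", "B"), Node("c.xml", "C")
    arcs = [Arc(a, b), Arc(a, c), Arc(b, c)]
    for test in ("*", "B", "Z"):
        assert traverse_axis(a, arcs, test) == arc_then_traverse_arc(a, arcs, test)
```

The two-step form becomes useful once a predicate is placed between the steps, because the arcs can then be filtered before they are traversed.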

3  Download

XPathLink is included in the SSAX-SXML package.
To download the package, go to the Download page.



This document was translated from LATEX by HEVEA.