The index page presents the primary high-level API functions:
procedure sxml:document :: REQ-URI [NAMESPACE-PREFIX-ASSIG] ->
-> SXML-TREE
Obtains a (possibly remote) document by its URI
Supported URI formats: local file and HTTP scheme
Supported document formats: XML and HTML
REQ-URI - a string that contains the URI of the requested document
NAMESPACE-PREFIX-ASSIG - is passed as-is to the SSAX parser, where it is
used for assigning user prefixes to namespaces.
NAMESPACE-PREFIX-ASSIG is an optional argument and has an effect for an
XML resource only. For an HTML resource, NAMESPACE-PREFIX-ASSIG
is silently ignored.
Result: the SXML representation for the requested document
procedure: ssax:xml->sxml PORT NAMESPACE-PREFIX-ASSIG
This is an instance of an SSAX parser that returns an SXML representation
of the XML document read from PORT.
NAMESPACE-PREFIX-ASSIG - a list of (USER-PREFIX . URI-STRING) pairs that
assigns USER-PREFIXes to the namespaces identified by the given URI-STRINGs.
It may be an empty list.
Result: an SXML tree. On return, PORT is positioned at the first character
after the root element.
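A minimal sketch of calling `ssax:xml->sxml' on an in-memory document. It assumes that the SSAX parser is already loaded (the loading form is implementation-specific and not shown) and that `open-input-string' is available (R7RS or SRFI 6). The sample XML, the namespace URI and the `ex' prefix are made up for illustration.

```scheme
;; Assumption: ssax:xml->sxml is already loaded; how to load it depends
;; on the Scheme implementation.
;; open-input-string comes from R7RS / SRFI 6.

;; No namespaces involved: pass an empty NAMESPACE-PREFIX-ASSIG list.
(define plain-doc
  (ssax:xml->sxml
    (open-input-string "<a><b>text</b></a>")
    '()))
;; plain-doc is (*TOP* (a (b "text")))

;; Hypothetical namespace URI bound to the user prefix `ex'.
;; Elements from that namespace are then named with the `ex:' prefix
;; in the resulting SXML tree.
(define ns-doc
  (ssax:xml->sxml
    (open-input-string
      "<a xmlns:x=\"http://example.org/x\"><x:b>text</x:b></a>")
    '((ex . "http://example.org/x"))))
```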
Opens an input port for a resource
REQ-URI - a string representing a URI of the resource
An input port is returned if there were no errors. In case of an error, the
function returns #f and displays an error message as a side effect. It doesn't
raise any exceptions.
xpath-string - an XPath location path (a string)
ns-binding - declared namespace prefixes (an optional argument)
ns-binding = (list (prefix . uri)
                   (prefix . uri)
                   ...)
prefix - a symbol
uri - a string
The returned result: (lambda (node . var-binding) ...)
                     or #f
#f - signals a parse error (an error message is printed as a side effect
during parsing)
(lambda (node . var-binding) ...) - an SXPath function
node - a node (or a node-set) of the SXML document
var-binding - XPath variable bindings (an optional argument)
var-binding = (list (var-name . value)
                    (var-name . value)
                    ...)
var-name - (a symbol) a name of a variable
value - its value. The value can have one of the following types: boolean,
number, string, nodeset. NOTE: a node must be represented as a singleton nodeset
Administrative SXPath variables:
*root* - if present in the 'var-binding', its value (a node or a nodeset)
specifies the root of the SXML document
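A minimal sketch of using such a constructor. It assumes that sxml-tools is already loaded and that the constructor described above is available as `txpath' (the name used in the sxpath rewriting rules); the sample document is made up for illustration.

```scheme
;; Assumption: sxml-tools is loaded and the constructor described above
;; is exported under the name txpath.
(define doc
  '(*TOP* (library (book (title "SICP"))
                   (book (title "HtDP")))))

;; Parse the location path once; #f would signal a parse error.
(define select-titles (txpath "/library/book/title"))

;; Apply the resulting SXPath function to the document node.
(define titles
  (if select-titles
      (select-titles doc)
      '()))
;; titles holds the two title elements in document order:
;; ((title "SICP") (title "HtDP"))
```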
Evaluate an abbreviated SXPath
sxpath:: AbbrPath -> Converter, or
sxpath:: AbbrPath -> Node|Nodeset -> Nodeset
AbbrPath is a list. It is translated to the full SXPath according
to the following rewriting rules:
(sxpath '()) -> (node-join)
(sxpath '(path-component ...)) ->
    (node-join (sxpath1 path-component) (sxpath '(...)))
(sxpath1 '//) -> (sxml:descendant-or-self sxml:node?)
(sxpath1 '(equal? x)) -> (select-kids (node-equal? x))
(sxpath1 '(eq? x)) -> (select-kids (node-eq? x))
(sxpath1 '(*or* ...)) -> (select-kids (ntype-names??
                                        (cdr '(*or* ...))))
(sxpath1 '(*not* ...)) -> (select-kids (sxml:complement
                                         (ntype-names??
                                           (cdr '(*not* ...)))))
(sxpath1 '(ns-id:* x)) -> (select-kids
                            (ntype-namespace-id?? x))
(sxpath1 ?symbol) -> (select-kids (ntype?? ?symbol))
(sxpath1 ?string) -> (txpath ?string)
(sxpath1 procedure) -> procedure
(sxpath1 '(?symbol ...)) -> (sxpath1 '((?symbol) ...))
(sxpath1 '(path reducer ...)) ->
    (node-reduce (sxpath path) (sxpathr reducer) ...)
(sxpathr number) -> (node-pos number)
(sxpathr path-filter) -> (filter (sxpath path-filter))
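The rewriting rules above can be illustrated with a few abbreviated paths. This is a sketch that assumes sxml-tools (and therefore `sxpath') is already loaded; the sample document is made up for illustration.

```scheme
;; Assumption: sxpath is already loaded from sxml-tools.
(define doc
  '(*TOP* (library (book (title "SICP") (author "Abelson"))
                   (book (title "HtDP")))))

;; Symbols select child elements by name, one step per symbol.
(define titles ((sxpath '(library book title)) doc))
;; titles is ((title "SICP") (title "HtDP"))

;; '// expands to descendant-or-self, so no full path is needed.
(define authors ((sxpath '(// author)) doc))
;; authors is ((author "Abelson"))

;; A (path reducer) step: a numeric reducer selects by position.
(define second-book ((sxpath '(library (book 2))) doc))
;; second-book is ((book (title "HtDP")))
```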
Generates an stx:stylesheet from a stylesheet represented as <stx-tree> in SXML format
Transforms the given SXML document <doc> using the stylesheet <sst> in SXML format
procedure: pre-post-order TREE BINDINGS
Traversal of an SXML tree or a grove:
a <Node> or a <Nodelist>
A <Node> and a <Nodelist> are mutually-recursive datatypes that
underlie the SXML tree:
<Node> ::= (name . <Nodelist>) | "text string"
An (ordered) set of nodes is just a list of the constituent nodes:
<Nodelist> ::= (<Node> ...)
Nodelists, and Nodes other than text strings are both lists. A
<Nodelist> however is either an empty list, or a list whose head is
not a symbol (an atom in general). A symbol at the head of a node is
either an XML name (in which case it's a tag of an XML element), or
an administrative name such as '@'.
See SXPath.scm and SSAX.scm for more information on SXML.
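The distinction between a <Node> and a <Nodelist> drawn above can be sketched as two predicates in plain Scheme. The names `nodelist?' and `node?' are hypothetical and chosen for this illustration only; no library is required.

```scheme
;; Hypothetical helper: a nodelist is the empty list, or a list whose
;; head is not a symbol.
(define (nodelist? x)
  (or (null? x)
      (and (pair? x) (not (symbol? (car x))))))

;; Hypothetical helper: a node is a text string, or a list headed by a
;; symbol (an XML name or an administrative name such as '@').
(define (node? x)
  (or (string? x)
      (and (pair? x) (symbol? (car x)))))

(nodelist? '((p "a") "b"))   ; #t, the head is a list
(nodelist? '(p "a"))         ; #f, the head is a symbol, so this is a node
(node? '(p "a"))             ; #t
(node? "plain text")         ; #t
```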
Pre-Post-order traversal of a tree and creation of a new tree:
pre-post-order:: <tree> x <bindings> -> <new-tree>
where
<bindings> ::= (<binding> ...)
<binding> ::= (<trigger-symbol> *preorder* . <handler>) |
              (<trigger-symbol> *macro* . <handler>) |
              (<trigger-symbol> <new-bindings> . <handler>) |
              (<trigger-symbol> . <handler>)
<trigger-symbol> ::= XMLname | *text* | *default*
<handler> :: <trigger-symbol> x [<tree>] -> <new-tree>
The pre-post-order function visits the nodes and nodelists in
pre-post order (depth-first). For each <Node> of the form (name
<Node> ...) it looks up an association with the given 'name' among
its <bindings>. If the lookup fails, pre-post-order tries to locate a
*default* binding. It's an error if the latter attempt fails as
well. Having found a binding, the pre-post-order function first
checks to see if the binding is of the form
(<trigger-symbol> *preorder* . <handler>)
If it is, the handler is 'applied' to the current node. Otherwise,
the pre-post-order function first calls itself recursively for each
child of the current node, with <new-bindings> prepended to the
<bindings> in effect. The result of these calls is passed to the
<handler> (along with the head of the current <Node>). To be more
precise, the handler is _applied_ to the head of the current node
and its processed children. The result of the handler, which should
also be a <tree>, replaces the current <Node>. If the current <Node>
is a text string or other atom, a special binding with a symbol
*text* is looked up.
A binding can also be of the form
(<trigger-symbol> *macro* . <handler>)
This is equivalent to *preorder* described above. However, the result
is then re-processed with the current stylesheet.
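The traversal described above can be illustrated with a small stylesheet. This is a sketch that assumes `pre-post-order' is already loaded from the SXML tools; the element names (doc, para, em, verbatim) and the sample tree are made up for illustration.

```scheme
;; Assumption: pre-post-order is already loaded.
(define source
  '(doc (para "Hello, " (em "world"))
        (verbatim (em "raw"))))

(define bindings
  `((doc  . ,(lambda (tag . kids) (cons 'html kids)))
    (para . ,(lambda (tag . kids) (cons 'p kids)))
    (em   . ,(lambda (tag . kids) (cons 'i kids)))
    ;; *preorder*: the handler receives the unprocessed children,
    ;; so the nested em is left untouched.
    (verbatim *preorder* . ,(lambda (tag . elems) (cons 'pre elems)))
    ;; Text strings are looked up under *text*.
    (*text* . ,(lambda (trigger str) str))
    ;; Any other element is rebuilt unchanged.
    (*default* . ,(lambda (tag . kids) (cons tag kids)))))

(define result (pre-post-order source bindings))
;; result is (html (p "Hello, " (i "world")) (pre (em "raw")))
```

The em inside para is renamed because its parent is processed post-order, while the em inside verbatim survives because the *preorder* handler is applied before any recursion into the children.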
procedure xlink:documents :: {REQ-URI}+ -> (listof SXML-TREE)
procedure xlink:documents-embed :: {REQ-URI}+ -> (listof SXML-TREE)
Both `xlink:documents' and `xlink:documents-embed' accept one or more
strings as their arguments. Each string supplied denotes the URI of the
requested document to be loaded. The requested document(s) are loaded
and are represented in SXML. All XLink links declared in these document(s)
are represented as a set of SXLink arcs. If any XLink links refer to XLink
linkbases [XLink, http://www.w3.org/TR/xlink/#xlg],
these linkbases are additionally loaded, for additional SXLink arcs
declared there.
The starting resource for each SXLink arc is determined:
1. For each SXML document loaded, the function `xlink:documents' adds all
SXLink arcs whose starting resource is located within this document to
the auxiliary list of its document node (*TOP*).
2. The function `xlink:documents-embed' embeds each SXLink arc into its
starting resource-node, via the auxiliary list of that node. For text nodes
that serve as starting resources, their SXLink arcs are stored in the
auxiliary list of the document node (*TOP*), since SXML text nodes do
not support their own auxiliary lists.
Supported URI formats:
+ local file
+ http:// scheme
Supported document formats: XML and HTML. In the case of HTML,
<A> hyperlinks are considered as XLink simple links.
Result: (listof SXML-TREE)
A particular SXML document can be located in this list using the
function `xlink:find-doc'.
xpath-string - an XPath location path (a string)
ns+na - can contain 'ns-binding' and/or 'num-ancestors' and/or none of them
ns-binding - declared namespace prefixes (an optional argument)
ns-binding ::= (listof (prefix . uri))
prefix - a symbol
uri - a string
num-ancestors - number of ancestors required for resulting nodeset. Can
generally be omitted and then defaults to 0, which denotes a
_usual_ nodeset. If a negative number, this signals that all
ancestors should be remembered in the context
Returns: (lambda (nodeset position+size var-binding) ...)
position+size - the same as what was called 'context' in TXPath-1
var-binding - XPath variable bindings (an optional argument)
var-binding = (listof (var-name . value))
var-name - (a symbol) a name of a variable
value - its value. The value can have one of the following types: boolean,
number, string, nodeset. NOTE: a node must be represented as a singleton nodeset
update-specifiers ::= (listof update-specifier)
update-specifier ::= (list xpath-location-path action [action-parameters])
xpath-location-path - addresses the node(s) to be transformed, in the form of
XPath location path. If the location path is absolute, it addresses the
node(s) with respect to the root of the document being transformed. If the
location path is relative, it addresses the node(s) with respect to the
node selected by the previous update-specifier. The location path in the
first update-specifier always addresses the node(s) with respect to the
root of the document. We'll further refer to the node with respect to which
the location path is evaluated as the base-node for this location path.
action - specifies the modification to be made over each of the node(s)
addressed by the location path. Possible actions are described below.
action-parameters - additional parameters supplied for the action. The number
of parameters and their semantics depend on the particular action.
action ::= 'delete | 'delete-undeep |
           'insert-into | 'insert-following | 'insert-preceding |
           'replace | 'rename |
           'move-into | 'move-following | 'move-preceding |
           handler
'delete - deletes the node. Expects no action-parameters
'delete-undeep - deletes the node, but keeps all its content (which thus
moves to one level upwards in the document tree). Expects no
action-parameters
'insert-into - inserts the new node(s) as the last children of the given
node. The new node(s) are specified in SXML as action-parameters
'insert-following, 'insert-preceding - inserts the new node(s) after (before)
the given node. Action-parameters are the same as for 'insert-into
'replace - replaces the given node with the new node(s). Action-parameters
are the same as for 'insert-into
'rename - renames the given node. The node to be renamed must be a pair (i.e.
not a text node). A single action-parameter is expected, which is to be
a Scheme symbol to specify the new name of the given node
'move-into - moves the given node to a new location. The single
action-parameter is the location path, which addresses the new location
with respect to the given node as the base node. The given node becomes
the last child of the node selected by the parameter location path.
'move-following, 'move-preceding - the given node is moved to the location
respectively after (before) the node selected by the parameter location
path
handler ::= (lambda (node context base-node) ...)
handler - specifies the required transformation. It is an arbitrary lambda
that consumes the node and its context (the latter can be used for addressing
other nodes of the source document relative to the given node). The handler
can return one of the following 2 things: a node or a nodeset.
1. If a node is returned, then it replaces the source node in the result
document
2. If a nodeset is returned, then the source node is replaced by (multiple)
nodes from this nodeset, in the same order in which they appear in the
nodeset. In particular, if the empty nodeset is returned by the handler, the
source node is removed from the result document and nothing is inserted
instead.
Returns either (lambda (doc) ...) or #f
The latter signals an error, and the error message is printed to stderr
as a side effect. In the former case, the lambda can be applied to an SXML
document and produces a new SXML document that is the result of the
specified modification.
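A minimal sketch of building and applying update-specifiers. It assumes that sxml-tools is already loaded and that the constructor described above is exported as `sxml:modify', as in the sxml-tools distribution; the sample document and element names are made up for illustration.

```scheme
;; Assumption: sxml:modify is already loaded from sxml-tools.
(define doc
  '(*TOP* (article (title "Draft")
                   (note "internal")
                   (para "Body"))))

(define update
  (sxml:modify
    ;; Remove the note element entirely.
    '("/article/note" delete)
    ;; Insert a new node right after the title.
    '("/article/title" insert-following (author "N.N."))
    ;; A handler action: rebuild each para as a p element.
    `("/article/para"
      ,(lambda (node context base-node)
         (cons 'p (cdr node))))))

;; sxml:modify returns #f on error, so guard the application.
(define new-doc
  (if update
      (update doc)
      doc))
```

All three location paths are absolute, so each one is evaluated with respect to the root of the document rather than the node selected by the previous update-specifier.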
Highest-level functions
procedure ddo:sxpath :: query [ns-binding] [num-ancestors] ->
-> node-or-nodeset [var-binding] -> nodeset
procedure ddo:txpath :: location-path [ns-binding] [num-ancestors] ->
-> node-or-nodeset [var-binding] -> nodeset
Polynomial-time XPath implementation with distinct document order support.
The API is identical to the API of a context-based SXPath (here we even use
API helpers from "xpath-context.scm"). For convenience, below we repeat
comments for the API (borrowed from "xpath-context.scm").
query - a query in SXPath native syntax
location-path - XPath location path represented as a string
ns-binding - declared namespace prefixes (an optional argument)
ns-binding ::= (listof (prefix . uri))
prefix - a symbol
uri - a string
num-ancestors - number of ancestors required for resulting nodeset. Can
generally be omitted and then defaults to 0, which denotes a
_conventional_ nodeset. If a negative number, this signals that all
ancestors should be remembered in the context.
Returns: (lambda (node-or-nodeset . var-binding) ...)
var-binding - XPath variable bindings (an optional argument)
var-binding = (listof (var-name . value))
var-name - (a symbol) a name of a variable
value - its value. The value can have the following type: boolean, number,
string, nodeset. NOTE: a node must be represented as a singleton nodeset.
The result of applying the latter lambda to an SXML node or nodeset is the
result of evaluating the query / location-path for that node / nodeset.
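A minimal sketch of the two entry points, assuming that the sxml-tools implementation providing `ddo:sxpath' and `ddo:txpath' is already loaded. The sample document is made up for illustration, and the optional ns-binding, num-ancestors and var-binding arguments are left out.

```scheme
;; Assumption: ddo:sxpath and ddo:txpath are already loaded.
(define doc
  '(*TOP* (library (book (title "SICP"))
                   (book (title "HtDP")))))

;; The same query in SXPath native syntax and as an XPath string.
(define by-sxpath ((ddo:sxpath '(library book title)) doc))
(define by-txpath ((ddo:txpath "/library/book/title") doc))

;; With num-ancestors omitted, both results are conventional nodesets,
;; i.e. plain lists of SXML nodes in document order.
```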
Produces a lazy SXML document, which corresponds to reading a source document in a stream-wise fashion
Support for native sxpath syntax
Converts the lazy result into a list, by forcing all the promises one by one
Converts the lazy node to SXML, by forcing all of its descendants. The node itself is not a promise
procedure srl:sxml->xml :: SXML-OBJ [PORT-OR-FILENAME] -> STRING|unspecified
Serializes the `sxml-obj' into XML, with indentation to facilitate
readability by a human.
sxml-obj - an SXML object (a node or a nodeset) to be serialized
port-or-filename - an output port or an output file name, an optional
argument
If `port-or-filename' is not supplied, the function returns a string that
contains the serialized representation of the `sxml-obj'.
If `port-or-filename' is supplied and is a port, the function writes the
serialized representation of `sxml-obj' to this port and returns an
unspecified result.
If `port-or-filename' is supplied and is a string, this string is treated as
an output filename, the serialized representation of `sxml-obj' is written to
that file and an unspecified result is returned. If a file with the given
name already exists, the effect is unspecified.
procedure srl:sxml->xml-noindent :: SXML-OBJ [PORT-OR-FILENAME] ->
-> STRING|unspecified
Serializes the `sxml-obj' into XML, without indentation.
procedure srl:sxml->html :: SXML-OBJ [PORT-OR-FILENAME] -> STRING|unspecified
Serializes the `sxml-obj' into HTML, with indentation to facilitate
readability by a human.
sxml-obj - an SXML object (a node or a nodeset) to be serialized
port-or-filename - an output port or an output file name, an optional
argument
If `port-or-filename' is not supplied, the function returns a string that
contains the serialized representation of the `sxml-obj'.
If `port-or-filename' is supplied and is a port, the function writes the
serialized representation of `sxml-obj' to this port and returns an
unspecified result.
If `port-or-filename' is supplied and is a string, this string is treated as
an output filename, the serialized representation of `sxml-obj' is written to
that file and an unspecified result is returned. If a file with the given
name already exists, the effect is unspecified.
procedure srl:sxml->html-noindent :: SXML-OBJ [PORT-OR-FILENAME] ->
-> STRING|unspecified
Serializes the `sxml-obj' into HTML, without indentation.
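A minimal sketch of the serializer entry points listed above, assuming the serializer from sxml-tools is already loaded. The sample elements are made up for illustration.

```scheme
;; Assumption: the srl: serializer procedures are already loaded.
(define elem '(a (@ (href "index.html")) "Home"))

;; No second argument: the serialized document is returned as a string.
(define xml-string (srl:sxml->xml-noindent elem))
;; xml-string is "<a href=\"index.html\">Home</a>"

;; A port as the second argument: the indented XML is written to the
;; port and the return value is unspecified.
(srl:sxml->xml elem (current-output-port))

;; The HTML variants take the same arguments.
(define html-string (srl:sxml->html-noindent '(p "One" (br) "Two")))
```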
procedure srl:parameterizable :: SXML-OBJ [PORT] {PARAM}* ->
-> STRING|unspecified
sxml-obj - an SXML object to serialize
param ::= (cons param-name param-value)
param-name ::= symbol
1. cdata-section-elements
value ::= (listof sxml-elem-name)
sxml-elem-name ::= symbol
2. indent
value ::= 'yes | #t | 'no | #f | whitespace-string
3. method
value ::= 'xml | 'html
4. ns-prefix-assig
value ::= (listof (cons prefix namespace-uri))
prefix ::= symbol
namespace-uri ::= string
5. omit-xml-declaration?
value ::= 'yes | #t | 'no | #f
6. standalone
value ::= 'yes | #t | 'no | #f | 'omit
7. version
value ::= string | number
ATTENTION: If a parameter name is unexpected or a parameter value is
ill-formed, the parameter is silently ignored. Probably, a warning message
in such a case would be more appropriate.
Example:
(srl:parameterizable
  '(tag (@ (attr "value")) (nested "text node") (empty))
  (current-output-port)
  '(method . xml)  ; XML output method is used by default
  '(indent . "\t")  ; use a single tabulation to indent nested elements
  '(omit-xml-declaration . #f)  ; add XML declaration
  '(standalone . yes)  ; denote a standalone XML document
  '(version . "1.0"))  ; XML version