In the previous notes, I demonstrated XSL transformations in which the processor is the browser (Firefox, in my examples). That usage is growing and may in time become very useful, although it is not widespread yet. One usage that is widespread, however, is standalone transformation: an XSLT engine reads XML input files, applies XSL transformations, and produces XML or XHTML files as output. In this mode, the XSLT engine is a separate program, not a browser that incorporates an engine.
Examples of such engines are Xalan from the Apache project and MSXML and MSXSL from Microsoft. For the examples in these notes, I am using the Microsoft products. Download and install both the parser MSXML and the command line Transformation Utility MSXSL (from MS or local copies).
As described previously, using XSL is a transformation process. The input file is an XML file, the program is an XSL program (also an XML formatted file) and the output is usually an XML (or XHTML) file.
The MSXSL documentation describes its command line syntax. The simplest MSXSL command line looks like this.
> MSXSL source.xml transform.xsl -o output.txt
There are a few options on the command line that may occasionally be useful (for example, -pi to use the stylesheet linked in the source document), but usually the basic command line does what you need.
Named constants are a help in keeping code organized and maintainable. In XSLT, a variable is a name that may be bound to a value. The value to which a variable is bound can be any object that can be returned by an expression (node-set, boolean, number, or string). There are two xsl elements that can be used to bind variables: xsl:variable and xsl:param.
The scope of the binding is hierarchical, following the structure of the document. The named object is visible in all enclosed scopes unless shadowed by another binding to the same name. Note that the inner binding does not change the original "variable"; rather, it creates a new local variable with a different value that shadows the original.
Both xsl:variable and xsl:param are allowed as top level elements (i.e., children of the root element) as well as in template elements and their children.
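The scoping rule can be sketched with a small stylesheet. This is only an illustration: the variable name label and the named template inner are hypothetical, and the stylesheet ignores the content of whatever input document it is given.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Top level binding: visible in every template (hypothetical name) -->
  <xsl:variable name="label">global</xsl:variable>

  <xsl:template match="/">
    <!-- Sees the top level binding -->
    <xsl:value-of select="$label"/>
    <xsl:text> </xsl:text>
    <xsl:call-template name="inner"/>
    <xsl:text> </xsl:text>
    <!-- The top level binding is unchanged after the call -->
    <xsl:value-of select="$label"/>
  </xsl:template>

  <xsl:template name="inner">
    <!-- Local binding: shadows the top level one inside this template only -->
    <xsl:variable name="label">local</xsl:variable>
    <xsl:value-of select="$label"/>
  </xsl:template>
</xsl:transform>
```

Run against any well-formed XML file, this should write global local global, which shows that the inner binding hides the outer one without modifying it.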
XSL templates can define parameters that are used by the templates instead of hard coded values. In the top level <xsl:transform> element or one of the <xsl:template> elements, the parameters are specified using <xsl:param> child elements. At every point where the value of the parameter is needed, you supply the name of the parameter preceded by a dollar sign $. Inside attribute values, you use {...} to get the literal value of the parameter, if that's what you need.
Our first example builds a simple table of the elements based on the allelements.xml file that we have been using. I want to parameterize the sorting that is used. So the file contains these entries.
<?xml version="1.0"?>
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="sortKey">SYMBOL</xsl:param>
<xsl:param name="sortType">text</xsl:param>
<xsl:param name="sortOrder">ascending</xsl:param>
...
<xsl:sort order="{$sortOrder}" data-type="{$sortType}" select="*[name()=$sortKey]" />
...
Each transformation engine specifies a way to pass parameters into an XSL program if desired. For MSXSL, you just include param=value on the command line. Thus, in order to pass in the name of the child element to sort on, I used the following command line.
> msxsl allelements.xml build-atom-table.xsl -o atom-table.html sortKey=NAME
This command read in the allelements.xml file, transformed it according to build-atom-table.xsl, and wrote the result into atom-table.html.
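As a separate, minimal sketch of the two ways of referring to a parameter, the stylesheet below declares a default value and uses it once in an expression and once inside an attribute value. The parameter name tableBorder is hypothetical and is not part of build-atom-table.xsl.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical parameter; the element content is the default value -->
  <xsl:param name="tableBorder">1</xsl:param>

  <xsl:template match="/">
    <!-- Inside an attribute of a literal result element, {...} inserts the value -->
    <table border="{$tableBorder}">
      <tr>
        <!-- Inside a select expression, $name refers to the parameter directly -->
        <td>border = <xsl:value-of select="$tableBorder"/></td>
      </tr>
    </table>
  </xsl:template>
</xsl:transform>
```

Following the param=value pattern shown above, adding tableBorder=3 to the MSXSL command line would be expected to override the default of 1.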
Sometimes it is necessary to process elements more than one way in the course of building a page. The optional mode attribute on xsl:template and xsl:apply-templates provides a way to process an element multiple times, each time producing a different result. On the xsl:template elements, specify the mode attribute and give it a value, one for each style of processing. Then on the xsl:apply-templates element, specify the mode attribute and give it the appropriate value for this invocation.
> msxsl popclocks.xml build-population-table.xsl -o population.html headerLevel=4
... to transform popclocks.xml with build-population-table.xsl to produce population.html. The popclocks.xml file is an instance of one of the RSS feeds that the US Census Bureau makes available.
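The mode mechanism can be sketched as follows. This is not the content of build-population-table.xsl; it is a minimal illustration that assumes an RSS-like input with item elements containing title and description children, and the mode names toc and detail are made up for the example.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body>
        <!-- First pass over the items: one list entry per item -->
        <ul>
          <xsl:apply-templates select="//item" mode="toc"/>
        </ul>
        <!-- Second pass over the same items: the full entries -->
        <xsl:apply-templates select="//item" mode="detail"/>
      </body>
    </html>
  </xsl:template>

  <!-- Used only when apply-templates asks for mode="toc" -->
  <xsl:template match="item" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- Used only when apply-templates asks for mode="detail" -->
  <xsl:template match="item" mode="detail">
    <h2><xsl:value-of select="title"/></h2>
    <p><xsl:value-of select="description"/></p>
  </xsl:template>
</xsl:transform>
```

Both templates match the same item elements; the mode value on xsl:apply-templates decides which one is used for a given pass.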
We have discussed XPath expressions several times. The most important kind of expression is a location path which selects a set of nodes relative to the context node. We have been using relatively simple location paths, but there are a number of features that can be included in them to address nodes in more complex fashions than just "child of child" type specifications.
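A location path consists of a sequence of one or more location steps separated by /. Each location step has three parts:
- an axis, which specifies the relationship between the selected nodes and the context node
- a node test, which specifies the type or name of the selected nodes
- zero or more predicates, which filter the selected nodes further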
There are several different axes available. We have been using abbreviations to specify them in our paths, but there are more options. The following two tables are copied directly from XML 1.1 Bible, chapter 15, XSL Transformations.
Axis | Selects From |
---|---|
ancestor | The parent of the context node, the parent of the parent of the context node, the parent of the parent of the parent of the context node, and so forth back to the root node |
ancestor-or-self | The ancestors of the context node and the context node itself |
attribute | The attributes of the context node |
child | The immediate children of the context node. This is the default axis. |
descendant | The children of the context node, the children of the children of the context node, and so forth |
descendant-or-self | The context node itself and its descendants |
following | All nodes that start after the end of the context node, excluding attribute and namespace nodes |
following-sibling | All nodes that start after the end of the context node and have the same parent as the context node |
namespace | The namespace of the context node |
parent | The unique parent node of the context node |
preceding | All nodes that finish before the beginning of the context node, excluding attribute and namespace nodes |
preceding-sibling | All nodes that start before the beginning of the context node and have the same parent as the context node |
self | The context node |
The abbreviations that we have been using are much preferred if they can be applied.
Abbreviation | Full Meaning |
---|---|
. | self::node() |
.. | parent::node() |
name | child::name |
@name | attribute::name |
// | /descendant-or-self::node()/ |
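To see that an abbreviated path and its expanded form select the same nodes, the sketch below counts the nodes selected by each. It assumes, purely for illustration, an input document containing item elements that carry an id attribute.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Abbreviated form: id attributes of item elements anywhere in the document -->
    <xsl:value-of select="count(//item/@id)"/>
    <xsl:text> </xsl:text>
    <!-- The same path with every axis written out -->
    <xsl:value-of
        select="count(/descendant-or-self::node()/child::item/attribute::id)"/>
  </xsl:template>
</xsl:transform>
```

The two numbers written to the output should always be equal, whatever the input document contains.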
We don't usually need to worry about the node tests, because we can just use the name of the node of interest. But for the record, here are the node tests.
- name of the node
- node() - any node type
- comment() - any comment node
- text() - any text node
- processing-instruction() - a processing instruction

The predicate (enclosed in [...]) filters a node-set to produce a new node-set. For each node in the node-set to be filtered, the predicate expression is evaluated with that node as the context node, with the number of nodes in the node-set as the context size, and with the proximity position of the node in the node-set with respect to the axis as the context position; if the expression evaluates to true for that node, the node is included in the new node-set; otherwise, it is not included.
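A few common predicate forms are sketched below. The element and attribute names (item, title, id) are assumptions made for the example and do not come from any of the files discussed in these notes.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Boolean predicate: keep only the items that carry an id attribute -->
    <xsl:for-each select="//item[@id]">
      <xsl:value-of select="title"/>
      <!-- Numeric predicate on a reverse axis: [1] is the nearest
           preceding sibling, not the first one in document order -->
      <xsl:if test="preceding-sibling::item[1]">
        <xsl:text> follows </xsl:text>
        <xsl:value-of select="preceding-sibling::item[1]/title"/>
      </xsl:if>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
    <!-- Context size: last() is the number of nodes being filtered, so this
         keeps the final item child of each parent; value-of then prints
         the first such node in document order -->
    <xsl:text>last: </xsl:text>
    <xsl:value-of select="//item[position() = last()]/title"/>
  </xsl:template>
</xsl:transform>
```

The preceding-sibling case is the one where "proximity position with respect to the axis" matters: on a reverse axis, positions are counted backwards from the context node.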
For some of the classes that I teach (for example CSE/INFO 100), I am producing a news feed for the homework. I do this in three steps. First, I make sure that the calendar page is valid XHTML (and therefore valid XML). Second, I use a few special attributes on the entries as I update the calendar with new assignments (readme-atom-feed.txt). Third, every time I make a change in the assignments, I run the XSLT transformation build-homework-atom.xsl against the schedule using the Apache Xalan XSLT processor (using build-homework-atom). This produces a news reader XML page, homework100.xml. The build-homework-atom.xsl file contains examples of more elaborate location paths.