13 - More XSL, CSE 413, Autumn 2005

Introduction

In the previous notes, I demonstrated XSL transformations in which the processor is the browser (Firefox in my examples). That usage is growing and may in time become a very useful tool, although it is not widespread yet. One usage that is widespread, however, is standalone transformation: an XSLT engine reads XML input files, applies an XSL transformation, and produces XML or XHTML files as output. In this mode, the XSLT engine is a separate program, not a browser that incorporates an engine.

Examples of such engines are Xalan from the Apache project and MSXML and MSXSL from Microsoft. For the examples in these notes, I am using the Microsoft products. Download and install both the parser MSXML and the command line Transformation Utility MSXSL (from MS or local copies).

Process

As described previously, using XSL is a transformation process. The input file is an XML file, the program is an XSL program (also an XML formatted file) and the output is usually an XML (or XHTML) file.

The MSXSL documentation describes its command line syntax. The simplest MSXSL command line looks like this.

> MSXSL source.xml transform.xsl -o output.txt

There are a few options on the command line that may occasionally be useful (for example, -pi to use the stylesheet linked in the source document), but usually the basic command line does what you need.
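
The -pi option relies on an xml-stylesheet processing instruction in the source document. The following is a minimal sketch of such a source file. The greeting element and the transform.xsl href are placeholders for illustration, not files from these notes.

```xml
<?xml version="1.0"?>
<!-- This processing instruction links the document to a stylesheet. -->
<?xml-stylesheet type="text/xsl" href="transform.xsl"?>
<!-- Hypothetical content, for illustration only -->
<greeting>Hello</greeting>
```

With a link like this in place, the engine can find the stylesheet from the source document itself. Check the MSXSL documentation for the exact command line form.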

Constants (aka Variables)

Named constants are a help in keeping code organized and maintainable. In XSLT, a variable is a name that may be bound to a value. The value to which a variable is bound can be any object that can be returned by an expression (node-set, boolean, number, or string). There are two xsl elements that can be used to bind variables: xsl:variable and xsl:param.

The scope of the binding is hierarchical, following the structure of the document. The named object is visible in all enclosed scopes unless shadowed by another binding to the same name. Note that the inner binding does not change the original "variable", but rather creates a new local variable with a different value that shadows the original.

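Both xsl:variable and xsl:param are allowed as top level elements (i.e., children of the root element). Within a template, xsl:variable can appear anywhere in the template body, while xsl:param must come first among the children of the xsl:template element.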
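
The shadowing rule can be sketched with a small stylesheet. The variable name label and the template name inner are made up for this illustration. The stylesheet runs against any XML input, since it only matches the root node.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Top level binding: visible in every template unless shadowed -->
  <xsl:variable name="label" select="'global'"/>

  <xsl:template match="/">
    <!-- Sees the top level binding -->
    <xsl:value-of select="$label"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:call-template name="inner"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Still the top level binding; the inner one did not change it -->
    <xsl:value-of select="$label"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <xsl:template name="inner">
    <!-- New local binding that shadows the top level one in this template only -->
    <xsl:variable name="label" select="'local'"/>
    <xsl:value-of select="$label"/>
  </xsl:template>
</xsl:transform>
```

The output is three lines: global, local, global. XSLT 1.0 allows a local binding to shadow a top level one, but treats it as an error when one local binding shadows another within the same template.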

Parameters

XSL templates can define parameters that are used by the templates instead of hard coded values. In the top level <xsl:transform> element or one of the <xsl:template> elements, the parameters are specified using <xsl:param> child elements. At every point where the value of the parameter is needed in an expression, you supply the name of the parameter preceded by a dollar sign $. Inside attribute values that are not themselves expressions, you wrap the reference in braces, as in {$name}, to get the literal value of the parameter, if that's what you need.

Our first example builds a simple table of the elements based on the allelements.xml file that we have been using. I want to parameterize the sorting that is used. So the file contains these entries.

<?xml version="1.0"?>
<xsl:transform version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="sortKey">SYMBOL</xsl:param>
<xsl:param name="sortType">text</xsl:param>
<xsl:param name="sortOrder">ascending</xsl:param>
...
<xsl:sort order="{$sortOrder}" data-type="{$sortType}" select="*[name()=$sortKey]" />
...

Each transformation engine specifies a way to pass parameters into an XSL program if desired. For MSXSL, you just include param=value on the command line. Thus, in order to pass in the name of the child element to sort on, I used the following command line.

> msxsl allelements.xml build-atom-table.xsl -o atom-table.html sortKey=NAME

This command read in the allelements.xml file, transformed it according to build-atom-table.xsl, and wrote the result into atom-table.html.
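
For readers who want to try the parameterized sort, here is one way the elided parts could be filled in. This is a sketch, not the actual build-atom-table.xsl. It assumes that allelements.xml has a PERIODIC_TABLE root element whose ATOM children each contain NAME and SYMBOL elements.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Defaults; each can be overridden with param=value on the command line -->
  <xsl:param name="sortKey">SYMBOL</xsl:param>
  <xsl:param name="sortType">text</xsl:param>
  <xsl:param name="sortOrder">ascending</xsl:param>

  <!-- Assumed input layout: PERIODIC_TABLE/ATOM with NAME and SYMBOL children -->
  <xsl:template match="/PERIODIC_TABLE">
    <table>
      <xsl:for-each select="ATOM">
        <!-- Sort on whichever child element is named by $sortKey -->
        <xsl:sort order="{$sortOrder}" data-type="{$sortType}"
                  select="*[name()=$sortKey]"/>
        <tr>
          <td><xsl:value-of select="SYMBOL"/></td>
          <td><xsl:value-of select="NAME"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:transform>
```

The order and data-type attributes take their values through {...}, while the select attribute is already an expression and so uses $sortKey directly.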

Modes

Sometimes it is necessary to process elements in more than one way in the course of building a page. The optional mode attribute on xsl:template and xsl:apply-templates provides a way to process an element multiple times, each time producing a different result. On xsl:template elements, specify the mode attribute and give it a value, one for each style of processing. Then on the xsl:apply-templates element, specify the mode attribute and give it the appropriate value for this invocation.

For example, I used this command line ...

> msxsl popclocks.xml build-population-table.xsl -o population.html headerLevel=4

... to transform popclocks.xml with build-population-table.xsl to produce population.html. The popclocks.xml file is an instance of one of the RSS feeds that the US Census Bureau makes available.
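
The following sketch shows how modes and a headerLevel parameter might fit together in a stylesheet of this kind. It is not the actual build-population-table.xsl. It assumes the usual RSS layout of rss/channel/item with title and description children, and the mode names toc and full are invented for the example.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Heading level for each entry; override with headerLevel=4 on the command line -->
  <xsl:param name="headerLevel">2</xsl:param>

  <xsl:template match="/rss/channel">
    <html>
      <body>
        <!-- First pass: one list entry per item -->
        <ul>
          <xsl:apply-templates select="item" mode="toc"/>
        </ul>
        <!-- Second pass: the same items, rendered in full -->
        <xsl:apply-templates select="item" mode="full"/>
      </body>
    </html>
  </xsl:template>

  <!-- Used only when apply-templates asks for mode="toc" -->
  <xsl:template match="item" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <!-- Used only when apply-templates asks for mode="full" -->
  <xsl:template match="item" mode="full">
    <!-- The element name is built from the parameter, e.g. h2 or h4 -->
    <xsl:element name="h{$headerLevel}">
      <xsl:value-of select="title"/>
    </xsl:element>
    <p><xsl:value-of select="description"/></p>
  </xsl:template>
</xsl:transform>
```

Both templates match the same item elements. The mode value on each xsl:apply-templates call decides which one runs.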

Location paths

We have discussed XPath expressions several times. The most important kind of expression is a location path which selects a set of nodes relative to the context node. We have been using relatively simple location paths, but there are a number of features that can be included in them to address nodes in more complex fashions than just "child of child" type specifications.

A location path consists of a sequence of one or more location steps separated by /. Each location step has three parts:

an axis, which specifies the tree relationship between the nodes selected by the location step and the context node
a node test, which specifies the name or node type of the nodes selected by the location step
zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step.
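
To make the three parts concrete, here is a small sketch that writes a location path out in full. The PERIODIC_TABLE, ATOM and NAME element names are an assumption about the layout of allelements.xml.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Three location steps separated by "/":
         child::PERIODIC_TABLE         axis child, node test PERIODIC_TABLE, no predicate
         child::ATOM[position() = 1]   axis child, node test ATOM, one predicate
         child::NAME                   axis child, node test NAME, no predicate -->
    <xsl:value-of
        select="child::PERIODIC_TABLE/child::ATOM[position() = 1]/child::NAME"/>
  </xsl:template>
</xsl:transform>
```

Under that assumption, the stylesheet prints the name of the first ATOM in the document.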

There are several different axes available. We have been using abbreviations to specify them in our paths, but there are more options. The following two tables are copied directly from XML 1.1 Bible, chapter 15, XSL Transformations.

Axis	Selects From
`ancestor`	The parent of the context node, the parent of the parent of the context node, the parent of the parent of the parent of the context node, and so forth back to the root node
`ancestor-or-self`	The ancestors of the context node and the context node itself
`attribute`	The attributes of the context node
`child`	The immediate children of the context node. This is the default axis.
`descendant`	The children of the context node, the children of the children of the context node, and so forth
`descendant-or-self`	The context node itself and its descendants
`following`	All nodes that start after the end of the context node, excluding attribute and namespace nodes
`following-sibling`	All nodes that start after the end of the context node and have the same parent as the context node
`namespace`	The namespace of the context node
`parent`	The unique parent node of the context node
`preceding`	All nodes that finish before the beginning of the context node, excluding attribute and namespace nodes
`preceding-sibling`	All nodes that start before the beginning of the context node and have the same parent as the context node
`self`	The context node

The abbreviations that we have been using are much preferred if they can be applied.

Abbreviation	Full Meaning
`.`	`self::node()`
`..`	`parent::node()`
`name`	`child::name`
`@name`	`attribute::name`
`//`	`/descendant-or-self::node()/`
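
As a quick check of the table, the sketch below pairs each abbreviation with its full form. Each pair selects the same nodes. The ATOM and NAME elements and the STATE attribute are assumed for illustration and may not match allelements.xml exactly.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="ATOM">
    <!-- name is child::name -->
    <xsl:value-of select="NAME"/>
    <xsl:value-of select="child::NAME"/>

    <!-- @name is attribute::name (STATE is a hypothetical attribute) -->
    <xsl:value-of select="@STATE"/>
    <xsl:value-of select="attribute::STATE"/>

    <!-- . is self::node() -->
    <xsl:value-of select="name(.)"/>
    <xsl:value-of select="name(self::node())"/>

    <!-- .. is parent::node() -->
    <xsl:value-of select="count(../ATOM)"/>
    <xsl:value-of select="count(parent::node()/child::ATOM)"/>

    <!-- // is /descendant-or-self::node()/ -->
    <xsl:value-of select="count(//ATOM)"/>
    <xsl:value-of select="count(/descendant-or-self::node()/child::ATOM)"/>
  </xsl:template>
</xsl:transform>
```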

We don't usually need to worry about the node tests, because we can just use the name of the node of interest. But for the record, here are the node tests.

name of the node
node() - any node type
comment() - any comment node
text() - any text node
processing-instruction() - a processing instruction

The predicate (enclosed in [...]) filters a node-set to produce a new node-set. For each node in the node-set to be filtered, the predicate expression is evaluated with that node as the context node, with the number of nodes in the node-set as the context size, and with the proximity position of the node in the node-set with respect to the axis as the context position; if the expression evaluates to true for that node, the node is included in the new node-set; otherwise, it is not included.
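
The roles of the context node, context size, and context position can be seen in a few short predicates. As before, this sketch assumes a PERIODIC_TABLE root with ATOM children that contain NAME and SYMBOL elements.

```xml
<?xml version="1.0"?>
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/PERIODIC_TABLE">
    <!-- Context position: [1] is short for [position() = 1], the first ATOM child -->
    <xsl:value-of select="ATOM[1]/NAME"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Context size: last() is the number of ATOM children, so this is the final one -->
    <xsl:value-of select="ATOM[position() = last()]/NAME"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Context node: SYMBOL is evaluated relative to each ATOM being tested -->
    <xsl:value-of select="ATOM[SYMBOL = 'He']/NAME"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Reverse axis: proximity position counts backwards from the context node,
         so [1] is the nearest preceding sibling of the last ATOM -->
    <xsl:value-of select="ATOM[last()]/preceding-sibling::ATOM[1]/NAME"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:transform>
```

The last expression shows why the definition says "with respect to the axis". On a reverse axis such as preceding-sibling, position 1 is the node closest to the context node, not the first one in document order.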

For some of the classes that I teach (for example CSE/INFO 100), I am producing a news feed for the homework. The process has three parts. First, I make sure that the calendar page is valid XHTML (and therefore valid XML). Second, I use a few special attributes on the entries as I update the calendar with new assignments (readme-atom-feed.txt). Third, every time I make a change in the assignments, I run the XSLT transformation build-homework-atom.xsl against the schedule with the Apache Xalan XSLT processor (using build-homework-atom). This produces a news reader XML page, homework100.xml. The build-homework-atom.xsl file contains examples of more elaborate location paths.