12 - XSL CSE 413 Lecture Notes

Introduction

I have mentioned separating structure from presentation several times in the course of discussing XML, XHTML, etc. This is generally thought to be a good thing, because it allows you to provide different presentations of the data without having to rewrite the basic structure of the information itself. Depending on the needs of the user, data can be presented with a different color scheme, in different fonts, linearly or in floating blocks, etc., all without changing the basic content files themselves.

There is another level of decomposition that is sometimes helpful to apply: splitting out the information content of the basic data from the structure of the basic data. In some sense the structure is also a variable attribute of the data, and there are times when the same information might be structured one way for one usage, and another way for another usage. In both cases, we want to retain the ability to further style the content for presentation.

XSLT Evolution

This situation is illustrated in this diagram taken from the article An Introduction to Client-Side XSLT on Digital Web Magazine.

Initially, data (information content), structure, and design (presentation) were all managed using the ad-hoc tools of HTML, as designed during the browser wars by competing programmers at Microsoft and Netscape. The advent of CSS and XHTML meant that we could separate the presentation from the information content and structure.

There is one further tool that has been defined and is very useful, both in creating web pages and in the more general application of structured XML data files: Extensible Stylesheet Language Transformations (XSLT). With XSLT, we can transform the structure of an XML file into another XML file, an XHTML file, or essentially any arbitrary format that is needed. If the resulting file is intended for presentation to the user, it can be further styled using CSS. If the result is intended for another application, then it can be used as is.

XSL / XSLT

XSL Transformations (XSLT) is an XML language for transforming XML documents from one syntax to another. An XSLT document can be thought of as a program that is interpreted by a processor. The program is supplied an input XML document (the source tree) and it produces a second document (the result tree) as output. The format of the result tree is entirely dependent on the XSLT program (or stylesheet).

Applying XSLT

XSLT is an XML application; that is, an XSLT stylesheet file is written in a dialect of XML. XSLT is a relatively simple pattern-based language, although it can be confusing at first. An XSLT stylesheet consists of a collection of template rules which define the transformations to be performed. These template rules can be applied recursively.
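To make the "collection of template rules" concrete, here is a minimal sketch of a stylesheet skeleton. The greeting element and its text are hypothetical placeholders chosen for illustration; only the xsl:stylesheet wrapper, the XSLT namespace URI, and xsl:template come from the XSLT language itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal stylesheet skeleton: one root element holding template rules. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A template rule: the pattern in "match" picks the nodes it applies to,
       and the element content is what gets written to the result tree.
       "greeting" is a hypothetical output element, not part of the notes. -->
  <xsl:template match="/">
    <greeting>Hello from the root template</greeting>
  </xsl:template>

</xsl:stylesheet>
```

Because the stylesheet is itself XML, it must be well-formed, and every XSLT instruction lives in the xsl namespace declared on the root element.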

The XSLT processor checks which template rules can be applied and executes the associated transformations based on a sequence of priorities. XSLT processors can be standalone programs (e.g., Apache Xalan, MSXSL) or part of a browser or other program (e.g., Firefox, IE 6, or applications built on libxslt, the XSLT library that accompanies libxml2). Either way, the source file is read in, the transformation is applied, and the result is made available in whatever medium was specified.

Example

As an example, recall the simple XML file ex11/twoelements.xml from the previous set of notes. That file has no attached transformation or style sheet, so browsers (Firefox and IE6, at least) display the XML directly.

For these XSL notes, I've written a short XSLT transformation ex12/build-element-page.xsl and associated it with the XML file by adding an <?xml-stylesheet ...?> processing instruction to the data file. Loading the new version of the file causes it to be transformed before being displayed. For example, see ex12/twoelements.xml and ex12/allelements.xml. Both of these files are transformed using XSLT, then the CSS presentation styles are applied using ex12/elements.css.
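As a sketch of what such a data file might look like, the following shows the processing instruction (spelled xml-stylesheet in the W3C recommendation) in the prolog of a small document. The element names are taken from the templates discussed in these notes; the two elements chosen, their values, and the relative href are assumptions for illustration, since the actual contents of ex12/twoelements.xml are not reproduced here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The processing instruction below names the stylesheet to apply.
     The href is an assumed relative path to the transformation file. -->
<?xml-stylesheet type="text/xsl" href="build-element-page.xsl"?>
<PERIODIC_TABLE>
  <!-- Sample data is illustrative only. -->
  <ATOM>
    <NAME>Hydrogen</NAME>
    <SYMBOL>H</SYMBOL>
    <ATOMIC_NUMBER>1</ATOMIC_NUMBER>
  </ATOM>
  <ATOM>
    <NAME>Helium</NAME>
    <SYMBOL>He</SYMBOL>
    <ATOMIC_NUMBER>2</ATOMIC_NUMBER>
  </ATOM>
</PERIODIC_TABLE>
```

The processing instruction has to appear before the document's root element; a browser that supports XSLT reads it, loads the named stylesheet, and displays the transformed result instead of the raw XML.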

XSL Templates

The key concept with XSL is the idea that template rules are matched to particular nodes in the source tree, and the output specified by the template is then written out. So there are two important issues:

  1. How does matching work?
  2. How does the template specify the output to be generated?

A template is an XML element. The name of the element is xsl:template. The element has an attribute named match. The value of the match attribute is a pattern, written in a restricted form of XPath, that specifies the nodes in the source tree to match. So for example, the following template will be applied to the root node:

  <xsl:template match="/">
    ...
  </xsl:template>

The result tree (the output document) is constructed by processing a list initially containing just the root node. A node is processed by finding all the template rules with patterns that match the node, and choosing the best amongst them. Then the chosen rule's template is instantiated with the node as the current node and with the list of source nodes as the current node list. A template typically contains instructions that select an additional list of source nodes for processing. The process of matching, instantiation and selection is continued recursively until no new source nodes are selected for processing.
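The "choosing the best amongst them" step can be illustrated with a small sketch in which two rules match the same node. In XSLT 1.0, a pattern that is just an element name has default priority 0, while the wildcard * has default priority -0.5, so the more specific rule is chosen. The report, atom, and other output elements are hypothetical names used only for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Start at the root and select every ATOM under PERIODIC_TABLE. -->
  <xsl:template match="/">
    <report>
      <xsl:apply-templates select="PERIODIC_TABLE/ATOM"/>
    </report>
  </xsl:template>

  <!-- Generic rule: matches any element (default priority -0.5). -->
  <xsl:template match="*">
    <other/>
  </xsl:template>

  <!-- Specific rule: matches ATOM by name (default priority 0),
       so it is chosen over the generic rule for every ATOM node. -->
  <xsl:template match="ATOM">
    <atom><xsl:value-of select="SYMBOL"/></atom>
  </xsl:template>

</xsl:stylesheet>
```

Both the * rule and the ATOM rule match each ATOM node, but only the ATOM rule is instantiated. An explicit priority attribute on xsl:template can override these defaults when needed.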

So the root node / is matched first. In our example, the template for the root node is

  <xsl:template match="/">

    <html xmlns="http://www.w3.org/1999/xhtml">
      <xsl:attribute name="xml:lang">en</xsl:attribute>
      <xsl:attribute name="lang">en</xsl:attribute>

      <head>
        <meta http-equiv="content-type" content="text/html;charset=utf-8" />
        <meta http-equiv="content-style-type" content="text/css" />
        <link>
          <xsl:attribute name="href">elements.css</xsl:attribute>
          <xsl:attribute name="rel">stylesheet</xsl:attribute>
          <xsl:attribute name="type">text/css</xsl:attribute>
        </link>
        <title>Elements Table</title>
      </head>

      <xsl:apply-templates/>
    </html>
  </xsl:template>

The template is defined by the start and end tags <xsl:template match="/">...</xsl:template>. Everything in between those tags is the template. Most of the text is output as given, with some changes for the attributes of the html and link tags. The element that "selects an additional list of source nodes" is <xsl:apply-templates/>. This tells the XSLT processor to compare each child node of the current node (in this case, the root) against the templates in the style sheet, and if a match is found, then output the template for the matched node.
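As a point of comparison, attributes whose names and values are fixed can also be written directly on the literal result elements. The following is a sketch of an alternative form of the root template, not the version used in ex12/build-element-page.xsl, and it omits the meta elements and the other templates for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Attributes written literally instead of with xsl:attribute. -->
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
      <head>
        <link href="elements.css" rel="stylesheet" type="text/css"/>
        <title>Elements Table</title>
      </head>
      <!-- Continue with the children of the root node. -->
      <xsl:apply-templates/>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:attribute form becomes necessary when an attribute is added conditionally or its name is computed at transformation time; for constant attributes the two forms produce the same result tree.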

When the <xsl:apply-templates/> element is processed, the processor finds that there is a child element PERIODIC_TABLE. It finds the appropriate template and applies that.

  <xsl:template match="PERIODIC_TABLE">
    <body xmlns="http://www.w3.org/1999/xhtml">
      <table>
        <tr>
          <th>Name</th>
          <th>Symbol</th>
          <th>Number</th>
        </tr>
        <xsl:apply-templates select="ATOM">
          <xsl:sort select="SYMBOL"/>
        </xsl:apply-templates>
      </table>
    </body>
  </xsl:template>

The processor emits the first part of the body of the html document, then encounters another <xsl:apply-templates> element. There is an xsl:sort child element enclosed in the body of the <xsl:apply-templates> element, so the selected ATOM nodes are sorted according to their SYMBOL elements. Then, applying the known templates to the ATOM children of the PERIODIC_TABLE, the processor finds two matches and processes those.
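The sort key and ordering can be varied. As a sketch, the following stylesheet sorts the ATOM nodes by ATOMIC_NUMBER instead, using the data-type and order attributes of xsl:sort. The simplified table markup is an assumption made to keep the example short; it is not the markup used in the notes' stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <table xmlns="http://www.w3.org/1999/xhtml">
      <xsl:apply-templates select="ATOM">
        <!-- data-type="number" compares the keys as numbers; the default,
             "text", compares them as strings and would place 10 before 2. -->
        <xsl:sort select="ATOMIC_NUMBER" data-type="number" order="descending"/>
      </xsl:apply-templates>
    </table>
  </xsl:template>

  <!-- One table row per ATOM, in the order produced by the sort. -->
  <xsl:template match="ATOM">
    <tr xmlns="http://www.w3.org/1999/xhtml">
      <td><xsl:value-of select="NAME"/></td>
      <td><xsl:value-of select="ATOMIC_NUMBER"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Several xsl:sort elements may be listed one after another; the first is the primary key and later ones break ties.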

  <xsl:template match="ATOM">
    <xsl:if test="ATOMIC_NUMBER&lt;10">
      <tr xmlns="http://www.w3.org/1999/xhtml">
        <td><xsl:value-of select="NAME"/></td>
        <td><xsl:value-of select="SYMBOL"/></td>
        <td><xsl:value-of select="ATOMIC_NUMBER"/></td>
      </tr>
    </xsl:if>
  </xsl:template>

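For each ATOM node whose ATOMIC_NUMBER is less than 10, the ATOM template outputs selected values from the node's children, along with some html markup to build a table row. The xsl:if test writes the comparison as &lt; because a literal < is not allowed inside an XML attribute value.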
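The same filtering can be expressed where the nodes are selected, using an XPath predicate, rather than inside the ATOM template. This is a sketch of that alternative, with simplified table markup assumed for brevity; it is not how ex12/build-element-page.xsl is written.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="PERIODIC_TABLE">
    <table xmlns="http://www.w3.org/1999/xhtml">
      <!-- The predicate in square brackets keeps only ATOM children
           whose ATOMIC_NUMBER is below 10; the rest are never selected. -->
      <xsl:apply-templates select="ATOM[ATOMIC_NUMBER &lt; 10]">
        <xsl:sort select="SYMBOL"/>
      </xsl:apply-templates>
    </table>
  </xsl:template>

  <!-- No xsl:if is needed here: every node that arrives has passed the filter. -->
  <xsl:template match="ATOM">
    <tr xmlns="http://www.w3.org/1999/xhtml">
      <td><xsl:value-of select="NAME"/></td>
      <td><xsl:value-of select="SYMBOL"/></td>
      <td><xsl:value-of select="ATOMIC_NUMBER"/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Filtering in the select keeps the ATOM template focused on output, while the xsl:if form keeps the condition next to the markup it guards; which reads better depends on how many places select ATOM nodes.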

The ATOM template does not select any more nodes, so recursive processing stops at this level and returns to the enclosing PERIODIC_TABLE template. The closing tags for the table and the body are emitted, and then that template is exhausted. Processing control returns to the initial root template, which emits the closing html tag, and then we're done!

A complete html document has been created based on an original xml document and an associated transformation. Notice that the created html document actually contains a reference to a CSS style sheet, so the presentation is polished off using the styles defined in that style sheet.

Useful xsl elements

There are numerous elements defined in the XSLT language. Refer to the various XSL resources for tutorials and more details. The following tags will get you started.