Server Side XSLT2 with Apache, Tomcat and Saxon

I spent several days trying to work out how to do server-side XSLT with an Apache server, following various articles on the web that suggested using Cocoon or mod-xslt2. All of them are out of date in one way or another: mod-xslt2 doesn’t work with Apache 2.2 yet, and the article describing the Cocoon approach used an earlier version of Cocoon, so it didn’t help much with the current one.

However, I have now managed server-side XSLT2 transformation using Apache, Tomcat, mod_jk, mod_rewrite and Saxon as the transformation engine.

The Theory

I noticed in the Saxon documentation that one of the demos is a simple servlet. So, by running this servlet on Tomcat, I should be able to do the transformations on the server.
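
As a rough sketch of the kind of work such a servlet does internally, the following uses the standard JAXP API (javax.xml.transform) to apply a stylesheet to a source document. It is not the code of the Saxon demo servlet; the class name TransformSketch and the tiny inline documents are made up for illustration. The stylesheet is deliberately XSLT 1.0 so that the example also runs on a plain JDK, whose built-in processor does not support XSLT 2.0.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of Saxon or of the original setup.
class TransformSketch {

    // Applies the stylesheet text to the source text and returns the result as a string.
    static String transform(String xml, String xsl) throws TransformerException {
        // newInstance() returns whichever XSLT processor JAXP finds; a processor
        // registered on the classpath can take the place of the JDK's built-in one.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Made-up sample input and stylesheet.
        String xml = "<page><title>Home</title></page>";
        String xsl =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/page'><xsl:value-of select='title'/></xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl));
    }
}
```

A servlet would do the same thing with the files named by its request parameters and write the result to the HTTP response instead of a string.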

The downside of this approach is that I was going to end up with really nasty looking URLs, such as

http://www.screwtape.co.uk/Servlets/SaxonServlet?source=xml/homePage.xml&style=xsl/main.xsl

which is hardly convenient when writing links, and while that could be automated in the xsl transformation, it would still look ugly in the browser address or status bars.

However, Apache has mod_rewrite, which does lots of exciting things with URLs. It can rewrite any incoming URL to poke the relevant parameters into the servlet.
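
To make the idea concrete, here is a small sketch of the mapping I want, written as an ordinary Java regular expression. It only illustrates the input and output of the rewrite; it is not how mod_rewrite itself is implemented, and the class name RewriteSketch is made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; it mimics the URL mapping, nothing more.
class RewriteSketch {

    // Matches any relative path that starts with xml/ and captures the whole path.
    private static final Pattern XML_PATH = Pattern.compile("^(xml/.*)$");

    // Returns the servlet URL for an xml/ path, or the path unchanged if it does not match.
    static String rewrite(String path) {
        Matcher m = XML_PATH.matcher(path);
        if (!m.matches()) {
            return path;
        }
        return "/Servlets/SaxonServlet?source=" + m.group(1) + "&style=xsl/main.xsl";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("xml/homePage.xml"));
        System.out.println(rewrite("images/logo.png"));
    }
}
```

So a visitor asks for the short xml/homePage.xml address, and the server quietly turns it into the long servlet URL shown earlier.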

The Practice

Install Apache, Tomcat and mod_jk. There are loads of instructions for that on the web, and anything I say will have been better said elsewhere, so I won’t.

Everything here is rather distribution dependent, so your system may be slightly different, but the ideas should be similar. I’m using CentOS 5.2 here.

  1. Download and install Saxon-B 9.1 for Java. Make sure you also get the resources package, as that has the sample servlet.
  2. Compile the SaxonServlet.java file and create a .war file. To do this, I created a web application in NetBeans 6.5 and let it do all the compiling and packaging.
  3. Using the Tomcat manager app, deploy the .war file.
  4. Place or link your xml and xsl files in the servlet directory, so the servlet can access them. I have /usr/share/tomcat5/webapps/Servlets/xml and /usr/share/tomcat5/webapps/Servlets/xsl linked to my webserver directories, although you could have the xml and xsl files just in the servlet directory.
  5. I use the default virtual host and keep the majority of the website files in the Apache html directory, so you can’t just use the auto-generated virtual host file. Update the /usr/share/tomcat5/conf/server.xml file: the <Host> needs some <Alias> tags to allow Tomcat to see your website when it is not accessed through localhost. The host section looks something like this:
    <Host name="localhost" appBase="webapps"
        unpackWARs="true" autoDeploy="true"
        xmlValidation="false" xmlNamespaceAware="false">
        <Alias>www.screwtape.co.uk</Alias>
        <Alias>screwtape.co.uk</Alias>
        ...
    </Host>
  6. Restart your Tomcat instance; this will create the “auto” subdirectory in the /usr/share/tomcat5/conf directory.
  7. Look through the mod_jk.conf in the auto directory, and you should see a pair of lines starting JkMount. Copy these lines into the main body of httpd.conf if, like me, you’re not using virtual hosts, or otherwise into the relevant <VirtualHost> element. You should have something like this:
    JkMount /Servlets ajp13
    JkMount /Servlets/* ajp13
  8. Restart httpd, then test that you can reach the servlet through Apache. From your browser, try
    http://localhost/Servlets/SaxonServlet

    This should show an error message about a missing style parameter. If you get this message then Apache and Tomcat are linked and you can proceed.

    It is probably a good idea to try accessing the servlet from any other hostnames or domain names that might use it, since it may work for localhost and for no other host. So trying

    http://<yourhostname>/Servlets/SaxonServlet

    may be a good idea too.

  9. Now we need to rewrite the incoming URLs to use the servlet. I have all my URLs beginning xml/ rewritten to use the servlet, with a single xsl transformation. If you need different xsl files, then this will be slightly more tricky, and will need a separate rewrite rule for each xsl file, or a more complex servlet. Add the RewriteRule directives in the <Directory> element appropriate for your site. I have
    <Directory "/var/www/html">
        ...
        RewriteEngine       on
        RewriteRule         (^xml/.*$)      /Servlets/SaxonServlet?source=$1&style=/xsl/main.xsl
    </Directory>
  10. Restart httpd and, assuming everything is OK, you should get your xml documents transformed using the specified xsl. That’s how this page works, as it’s entirely written in xml.
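
For the "more complex servlet" option mentioned in step 9, one possible approach is to let the servlet pick the stylesheet from the source path instead of adding a rewrite rule per xsl file. The sketch below shows only that selection logic, under assumptions of my own: the class name StylesheetChooser, the xml/blog/ prefix and xsl/blog.xsl are hypothetical, and only xsl/main.xsl comes from my actual setup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the Saxon sample servlet.
class StylesheetChooser {

    // Prefix-to-stylesheet table, checked in insertion order, so more specific
    // prefixes must be added before more general ones.
    private static final Map<String, String> STYLES = new LinkedHashMap<>();
    static {
        STYLES.put("xml/blog/", "xsl/blog.xsl"); // hypothetical entry
        STYLES.put("xml/", "xsl/main.xsl");
    }

    // Returns the stylesheet for the first matching prefix, or null if none matches.
    static String styleFor(String source) {
        for (Map.Entry<String, String> entry : STYLES.entrySet()) {
            if (source.startsWith(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(styleFor("xml/blog/post.xml"));
        System.out.println(styleFor("xml/homePage.xml"));
    }
}
```

With something like this inside the servlet, the rewrite rule would only need to pass the source parameter, at the cost of maintaining a modified servlet.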

I hope this is useful to someone, and if anyone has a better way of doing it I’ll be interested to know. One of my friends insists PHP is the way to go, but I can’t see that being any easier than this.