Toolchain

In this section, we will see how to transform a DocBook file into a PDF or an (X)HTML file by adding an intermediate pMML2SVG step to transform MathML into SVG.

The DocBook example source code used here is listed in Appendix A, DocBook test file source code. Throughout this section, we assume that a file named test.xml contains this code.

Do not forget to adapt the file paths to your system installation when testing the commands described in this section.
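
A quick way to catch a wrong path before running anything is to test that each file exists. The sketch below is only an illustration: check_files is a hypothetical helper, not part of pMML2SVG or DocBook-XSL, and the paths in the usage comment are simply the ones assumed throughout this section.

```shell
#!/bin/sh
# Hypothetical helper: report every argument that is not an existing
# regular file, and return non-zero if at least one is missing.
check_files() {
  status=0
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "not found: $path" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (paths are the ones assumed in this section; adapt them):
# check_files test.xml saxon9.jar \
#     docbook-xsl-1.74.3/fo/docbook.xsl \
#     pMML2SVG/tools/fopmml2svg.xsl || exit 1
```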

To transform a DocBook file into a PDF with the pMML2SVG step included, three transformations are needed. First, the DocBook file is transformed into an XSL-FO file. This transformation uses the DocBook-XSL XSLT stylesheets[1] and an XSLT 1 processor. Let us look at some example commands for different XSLT processors.

xsltproc can be used to transform test.xml with this command (if xsltproc is already installed):

xsltproc -o test.fo docbook-xsl-1.74.3/fo/docbook.xsl test.xml

Saxon 6.5.5 can also perform this first transformation, with the following command:

java -jar saxon.jar -o test.fo test.xml \
     docbook-xsl-1.74.3/fo/docbook.xsl

FOP is also able to do this transformation with the following command:

fop -xml test.xml -xsl docbook-xsl-1.74.3/fo/docbook.xsl \
    -foout test.fo

All these commands create a file called test.fo containing the XSL-FO code that will be used to render a PDF file. If you render a PDF directly from this file, the MathML source code is displayed in red on the PDF page.

The second transformation of our toolchain is the MathML to SVG transformation using pMML2SVG. This time, we need an XSLT 2 processor, and we will use Saxon 9. We will also use the pMML2SVG stylesheet named fopmml2svg.xsl, from the tools directory of the pMML2SVG distribution. This stylesheet transforms all the MathML that it finds inside the XSL-FO code. The following command line is used:

java -jar saxon9.jar -xsl:pMML2SVG/tools/fopmml2svg.xsl \
     -o:mathml_test.fo test.fo

This command creates a file called mathml_test.fo containing XSL-FO code with MathML transformed into SVG. This file will be used to render the final PDF output file.
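
To see whether a file still carries MathML or already contains SVG, one rough check is to search it for the corresponding namespace URI. The helper name uses_namespace is an assumption made for this sketch; the two URIs are the standard MathML and SVG namespaces. A match only shows that the namespace is mentioned somewhere in the file (an unused declaration also counts), so treat the result as a hint rather than a proof.

```shell
#!/bin/sh
# Standard namespace URIs for MathML and SVG.
MATHML_NS='http://www.w3.org/1998/Math/MathML'
SVG_NS='http://www.w3.org/2000/svg'

# Hypothetical helper: succeed if file $1 mentions namespace URI $2.
# -F treats the URI as a fixed string, -q suppresses the output.
uses_namespace() {
  grep -F -q "$2" "$1"
}

# Usage, with the file names of this section:
# uses_namespace test.fo "$MATHML_NS" && echo "test.fo contains MathML"
# uses_namespace mathml_test.fo "$SVG_NS" && echo "mathml_test.fo contains SVG"
```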

The last step of our toolchain is to transform the mathml_test.fo file into a PDF file. This transformation is done by using FOP with the following command:

fop mathml_test.fo test.pdf

You can now open test.pdf with your favorite PDF viewer and see the result.
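
As a recap, the three steps can be written down for any input file name. The function below is a sketch that only prints the commands instead of running them, which makes it easy to review the file names before executing anything. The name pdf_commands is hypothetical, the first step arbitrarily uses xsltproc, and the stylesheet and jar paths are the ones assumed in this section.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the three commands of the
# PDF toolchain for the DocBook file given as $1.
pdf_commands() {
  src=$1
  stem=${src%.*}   # strip the last extension: test.xml -> test
  printf '%s\n' "xsltproc -o $stem.fo docbook-xsl-1.74.3/fo/docbook.xsl $src"
  printf '%s\n' "java -jar saxon9.jar -xsl:pMML2SVG/tools/fopmml2svg.xsl -o:mathml_$stem.fo $stem.fo"
  printf '%s\n' "fop mathml_$stem.fo $stem.pdf"
}

# Usage:
# pdf_commands test.xml
```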

It is a real waste of time to type all these commands each time you want to compile your DocBook document into PDF. Therefore, it is useful to write a small script that runs all these commands for you. For example, create a file called pmml2svgpdf with the following source code:

#!/bin/sh
fop="fop" # Path to FOP

mathxsl="pMML2SVG/tools/fopmml2svg.xsl" # Path to pmml2svg XSLT 
                                        # stylesheet to treat fo
mathtransform="java -jar saxon9.jar -xsl:$mathxsl -o:" # Saxon 9

docbookxsl="docbook-xsl-1.74.3/fo/docbook.xsl" # Path to Norman 
                                               # Walsh XSLT
docbooktransform="$fop -xsl $docbookxsl -xml" # FOP command

for xmlfile in "$@"; do
  file=${xmlfile%.*}

  # First step: DocBook to XSL-FO transformation by using N. Walsh's
  # stylesheet.
  fofile=$file.fo
  echo "DocBook to XSL-FO: $file"
  $docbooktransform "$xmlfile" -foout "$fofile"

  # Second step: MathML to SVG transformation by using pMML2SVG.
  mmlfile=mathml_$file.fo
  echo "MathML Transformation: $file"
  $mathtransform"$mmlfile" "$fofile"

  # Third step: XSL-FO to PDF rendering by using FOP.
  pdffile=$file.pdf
  echo "XSL-FO to PDF: $file"
  $fop "$mmlfile" "$pdffile"

  # This last command removes the temporary files created
  # by the toolchain transformation.
  echo "Cleaning temporary files"
  rm -f "$fofile" "$mmlfile"
done

Now you can run the whole toolchain with this single command:

./pmml2svgpdf test.xml

You can also use it to process a group of DocBook files. For example, the command ./pmml2svgpdf *.xml transforms all the DocBook XML files in the current folder into PDF files.
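
One limitation is worth knowing when the input files are not in the current folder: the script builds the intermediate name by putting mathml_ in front of the whole path, so docs/test.xml would give mathml_docs/test.fo. A possible workaround, shown here only as a sketch with a hypothetical helper name, is to split the path with dirname and basename and put the prefix on the file name alone.

```shell
#!/bin/sh
# Hypothetical helper: name of the intermediate XSL-FO file for the
# DocBook file $1, with the mathml_ prefix placed on the file name
# rather than on the directory part of the path.
mathml_fo_name() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  stem=${base%.*}   # strip the last extension
  echo "$dir/mathml_$stem.fo"
}

# Usage:
# mathml_fo_name docs/test.xml
# mathml_fo_name test.xml
```

For a file in the current folder, dirname returns a single dot, so the result starts with ./ and still points to the same place.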

This document was compiled to PDF with this same script and toolchain.

To transform a DocBook file into (X)HTML, only one main transformation is needed. After that transformation, pMML2SVG is used to transform MathML into SVG. The main transformation uses an XSLT 1 stylesheet from Norman Walsh, so you can use either xsltproc or Saxon. Here are the command lines for each of these processors:

xsltproc -o test.xhtml docbook-xsl-1.74.3/xhtml-1_1/docbook.xsl \
      test.xml
java -jar saxon.jar -o test.xhtml test.xml \
     docbook-xsl-1.74.3/xhtml-1_1/docbook.xsl

If your browser supports MathML rendering (Firefox or Opera, for example), you can open test.xhtml with it to see the result. To transform the MathML code into SVG, use the htmlpmml2svg.xsl stylesheet (shipped in the tools/ folder of the pMML2SVG distribution) with an XSLT 2 processor. Here is a command that executes this transformation:

java -jar saxon9.jar -xsl:pMML2SVG/tools/htmlpmml2svg.xsl \
     -o:testSVG.xhtml test.xhtml

If your browser supports SVG (the majority of modern browsers do), you can open testSVG.xhtml with it to see the result.

As with the PDF transformation, you can use a script to run the transformation. Here is a sample script that transforms a DocBook file into an (X)HTML file:

#!/bin/sh
mathxsl="pMML2SVG/tools/htmlpmml2svg.xsl" # pmml2svg html stylesheet
mathtransform="java -jar saxon9.jar -xsl:$mathxsl -o:" # Saxon 9

docbookxsl="docbook-xsl-1.74.3/xhtml-1_1/docbook.xsl" # DocBook to
                                                      # XHTML 
                                                      # stylesheet
docbooktransform="xsltproc --output " # XSLT 1 processor 
                                      # (xsltproc here)

for xmlfile in "$@"; do
  file=${xmlfile%.*}

  # First step: DocBook to XHTML transformation by using N. Walsh's
  # stylesheet.
  tempfile=temp_$file.xhtml
  echo "DocBook to XHTML: $file"
  $docbooktransform "$tempfile" "$docbookxsl" "$xmlfile"

  # Second step: MathML to SVG transformation by using pMML2SVG.
  htmlfile=$file.xhtml
  echo "MathML Transformation: $file"
  $mathtransform"$htmlfile" "$tempfile"

  # This last command removes the temporary file created
  # by the toolchain transformation.
  echo "Cleaning temporary file"
  rm -f "$tempfile"
done

This script can also be used to transform a group of DocBook files into (X)HTML files, in the same way as the PDF script.
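
Both scripts run every step even when an earlier one has failed, so an error in the first transformation is followed by further errors from the later steps. One possible refinement, given here as a sketch and not as part of the pMML2SVG distribution, is to wrap each command in a small function that prints the step label, runs the command, and reports a failure through its exit status. The name run_step is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the label $1, run the remaining arguments
# as a command, and return non-zero if that command fails.
run_step() {
  label=$1
  shift
  echo "$label"
  "$@" || { echo "failed: $label" >&2; return 1; }
}

# Usage inside the loop of the (X)HTML script, skipping to the next
# file as soon as a step fails:
# run_step "DocBook to XHTML: $file" \
#     xsltproc --output "$tempfile" "$docbookxsl" "$xmlfile" || continue
```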



[1] These stylesheets can be found on this website: http://docbook.sourceforge.net/