DocBook

Revision 12 as of 2006-05-23 22:09:03

1. Purpose

The purpose of this page is to give an overview of the DocBook format. It explains the advantages of the format, links to further reading on the subject, and contains a short tutorial. For an explanation of the Ubuntu Documentation Project's usage of DocBook, see the [Ubuntu Docbook Interchange Protocol].

2. What is DocBook?

DocBook is an XML-based standard used for many of today's documentation tasks. Practically speaking, to create a DocBook document source you write XML files that describe the document's structure, paragraph division and other attributes. The XML file structure might look familiar to you if you have seen HTML code before. XML is a stricter and more general markup language than the older HTML specification, and it can be used to produce complete web pages and other kinds of documents.

3. What are the Advantages of Docbook?

DocBook is an OASIS standard and the format in which many open source projects store their documentation. It is developed in the open: the schema is maintained by an OASIS technical committee, and the DocBook XSL stylesheets are hosted at SourceForge under a permissive open source license. DocBook is available as a Document Type Definition (DTD) and XML Schema (XSD). The project has a large developer and support community spanning both open source and commercial groups.

The most important reasons why the project uses Docbook include:

  1. Docbook is a standard
  2. Docbook is open source
  3. Docbook is used by most major projects
  4. Docbook has a large developer and support community

In addition, DocBook is an XML application, and XML technologies solve a number of publishing problems for documentation teams, including:

  • Single-sourcing
  • Collaborative authoring
  • Cross-platform editing
  • Multi-channel publishing
  • Improving information quality and consistency
  • Enhancing functionality of electronic output
  • Negating vendor lock-in

More information on these points can be found at http://www.sastc.org.za/articles/xml-solv-prop.html

If you already understand XML then you are in a good position to start learning DocBook. If you don't understand XML, the good news is that learning DocBook will help you learn XML. Below are two books that are a must read for anyone just starting with DocBook.

4. Further Reading

If you have installed the package 'docbook-defguide' you can read the guide in your web browser at:

http://localhost/doc/docbook-defguide/html/docbook.html (assuming that your Apache still has /doc aliased to /usr/share/doc)

You can also open it from the command line using::

 lynx /usr/share/doc/docbook-defguide/html/docbook.html

While reading these works it is useful to experiment. For this you will need an XML publishing tool chain and an XML editor. The DocBook web site and wiki provide links to more information on the tool chains and editors you can use to author DocBook documents.

For an explanation of the Ubuntu Documentation Project's usage of DocBook, see the [Ubuntu Docbook Interchange Protocol].

5. Quick Tutorial

5.1. What does DocBook look like?

DocBook defines a number of 'tags' just like HTML does. To set the author's name you would write something like...

 <author>
   <firstname>Christoph</firstname>
   <surname>Haas</surname>
 </author>

As you can see this is very similar to HTML. In a minute I will show you a fully working example of a complete XML document. Give me a second.

The 'flavor' we are using to write these tags is XML. So it is called DocBook/XML. (The other 'flavor' would be SGML which is not much different. XML is just more strict. Actually HTML is a kind of SGML language. Most people believe that SGML is deprecated. Thus documentation in Debian is currently converted to XML.) Even if you have not yet used XML you won't have much trouble.

5.2. Style sheets

To create the output document from your XML input you also need a style sheet. DocBook style sheets are written in 'XSL Transformations' (XSLT), which is part of the 'Extensible Stylesheet Language' (XSL) family. Basically, an XSLT style sheet describes how to convert one document into another. Usually you won't need to know what style sheets look like. In fact they are very ugly. Finally, you need a 'processor' that takes the XML and the XSLT and creates the output file from them. We will use the free 'xsltproc' program for that purpose.

A number of style sheets are available to convert your document into:

- PostScript
- PDF
- XHTML
- man
- texinfo

Other converters exist to convert DocBook into formats like Yelp (the GNOME help format).

5.3. Hello World

Enough with the dry theory. You are probably already yearning to get your first 'Hello World' document in XML. First you need to install the following packages:

- xsltproc (the XSL Transformations Processor)
- docbook-xsl (stylesheets for HTML, XHTML, HTML Help and others)
- docbook-defguide (The Definitive Guide to DocBook - recommended)

Enter these lines into a file and call it test.xml:

 <?xml version="1.0" encoding="UTF-8"?>
 <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://docbook.org/xml/4.2/docbookx.dtd">
 <article>
   <title>My first Docbook document</title>
   <sect1>
      <title>The greeting</title>
      <para>
        Hello world
      </para>
   </sect1>
 </article>

Please note that you should really really use UTF-8 as a character encoding. You probably need to switch your terminal to UTF-8 mode and your editor, too.
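
Before running the style sheet it can help to confirm that the file is well-formed XML. The following is a minimal sketch in Python using only the standard library. The function name `section_titles` is hypothetical, and the check covers well-formedness only: it does not validate the document against the DocBook DTD.

```python
import xml.etree.ElementTree as ET

# The Hello World document from above, as UTF-8 bytes.
HELLO_WORLD = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" "http://docbook.org/xml/4.2/docbookx.dtd">
<article>
  <title>My first Docbook document</title>
  <sect1>
    <title>The greeting</title>
    <para>
      Hello world
    </para>
  </sect1>
</article>
"""


def section_titles(xml_bytes):
    """Parse the document and return the titles of its <sect1> sections.

    ET.fromstring raises ET.ParseError if the input is not well-formed.
    The external DTD named in the DOCTYPE is not fetched or checked.
    """
    root = ET.fromstring(xml_bytes)
    return [sect.findtext("title") for sect in root.iter("sect1")]


if __name__ == "__main__":
    print(section_titles(HELLO_WORLD))
```

To check a file on disk instead, read it in binary mode and pass the bytes to the same function.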

Now run this command::

 xsltproc -o test.html /usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl test.xml

You should find a file 'test.html' in the current directory. View it with your favorite web browser. Congratulations. (Output page: http://workaround.org/ubuntu/test.html)

Now what did that line actually do?

'xsltproc' is the converter program. '-o test.html' sets the output file. The next parameter '.../docbook.xsl' is the stylesheet you are using for the conversion - this one converts XML to XHTML. And finally the 'test.xml' tells xsltproc where your input file is located.

5.4. Customising style sheets

You will probably be disappointed about the look of the DocBook output. Yes, it's great to have it convert the document automatically. But it probably does not fit into your web design or 'corporate identity' at all. There is however a remedy.

Style sheets usually provide a number of knobs and wheels that you can turn. Usually you write your own stylesheet that imports the 'standard' style sheet. Let me give an example::

 <?xml version='1.0'?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:import href="/usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl"/>
    <xsl:param name="toc.max.depth">1</xsl:param>
    <xsl:param name="html.stylesheet" select="'/ubuntu.css'"/>
    <xsl:template name="user.header.content">
      <a href="/">Back to main page</a>
    </xsl:template>
 </xsl:stylesheet>

This little stylesheet first imports the docbook.xsl I mentioned earlier. But it also sets a few parameters:

- Set the maximum depth of the TOC (table of contents) to '1', so only the <sect1> sections will be included in the TOC.
- The final XHTML document will use the 'ubuntu.css' style sheet (CSS).
- Include a link to the main page on top of the page.

Of course these settings only work with the XHTML style sheet. For other output formats you need other settings. The settings above are documented at /usr/share/doc/docbook-xsl/doc/html/index.html

You will also want to use http://www.sagehill.net/docbookxsl/ as a reference.
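
Simple parameters such as 'toc.max.depth' and 'html.stylesheet' can also be passed on the command line with xsltproc's `--stringparam` option, which avoids a custom style sheet when only parameters change. The sketch below builds such a command line in Python. The function name `xsltproc_command` is hypothetical, and a template override such as 'user.header.content' still requires a style sheet file.

```python
import shlex


def xsltproc_command(stylesheet, source, output, params=None):
    """Return the xsltproc argument list for one conversion.

    params maps style sheet parameter names to values; each pair is
    passed as: --stringparam NAME VALUE
    """
    cmd = ["xsltproc", "-o", output]
    for name, value in (params or {}).items():
        cmd += ["--stringparam", name, str(value)]
    # xsltproc expects the style sheet first, then the input document.
    cmd += [stylesheet, source]
    return cmd


if __name__ == "__main__":
    cmd = xsltproc_command(
        "/usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl",
        "test.xml",
        "test.html",
        {"toc.max.depth": 1, "html.stylesheet": "/ubuntu.css"},
    )
    # Print the command so it can be copied into a shell.
    print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` unchanged if you want Python to start the conversion itself.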

5.5. Makefile

If you have multiple XML files or style sheets you may want to have all the processing done in a Makefile. This is an example::

 # Add your language file here:
 TARGETS = faq.html

 XSLTPROC = /usr/bin/xsltproc
 XSL = ubuntu.xsl

 # Recipe lines must begin with a tab character, not spaces.
 %.html: %.xml $(XSL)
 	@$(XSLTPROC) -o $@ $(XSL) $<

 .PHONY: all clean

 all: $(TARGETS)

 clean:
 	@rm -f $(TARGETS)
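
The pattern rule in this Makefile makes each HTML file depend on its XML source and on the style sheet, so make rebuilds a target only when it is missing or older than one of those two files. The following Python sketch illustrates that timestamp comparison. The function name `needs_rebuild` is hypothetical, and this is a simplification for illustration, not make's full algorithm.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Any prerequisite modified after the target makes the target stale.
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


if __name__ == "__main__":
    # File names taken from the Makefile above.
    print(needs_rebuild("faq.html", ["faq.xml", "ubuntu.xsl"]))
```

Because the style sheet is listed as a prerequisite, editing 'ubuntu.xsl' causes every HTML target to be regenerated on the next run.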

6. Editing Programs

  1. Bluefish
  2. conglomerate (WYSIWYG)
    • Somewhat beta. Doesn't hide the gory details. You still need to read the DocBook reference. Just makes it graphical.

    • See http://www.conglomerate.org/

  3. VIM file type plugin "xmledit"
  4. EMACS XML support
    • Some say DocBook is easy to write only under psgml. Some use Emacs only for psgml-mode.

    • nxml-mode (http://www.thaiopensource.com/nxml-mode/) however is far superior to psgml-mode. It does real-time syntax and error highlighting.

  5. See also DocBookEditors.

*Christoph Haas*


CategoryDocteam