15 February 2011

How to Access the DocBook DTDs When oasis-open.org Blocks You

Have you ever used the DocBook DTDs (or XSDs) and been blocked by the oasis-open.org website after a handful of accesses? If so, here's how to work around the problem. Granted, the solution I offer is specific to Mac OS X, but as long as you can run a web server on your operating system, the same approach will work (after all, Mac OS X is really just BSD Unix).

While working on the ActiveMQ In Action book, we were originally using the DocBook 4.5 DTD directly from the oasis-open.org website. Every time I built the book to transform the DocBook XML into PDFs, the Maven build would fetch the DTDs from that site. That's pretty standard when the DTDs are not already packaged in a JAR where they can be accessed locally. I wasn't happy with the build needing to grab these DTDs on every iteration, but since I have a fast connection it wasn't a big deal for me. However, after a very short period of time, the oasis-open.org website blocked me from accessing the DTDs. This was a pain because it caused the build to fail. To work around this problem, here's what I did.

I simply downloaded the DTDs and related files that I needed, put them in the /Library/WebServer/Documents/docbook/xml/4.5/ directory on my local hard drive, and added the following entry to the /etc/hosts file so that any requests to oasis-open.org point to my local machine: www.oasis-open.org

Then I just started up web sharing on the MacBook Pro I used to work on the book and voila! I no longer had to access the DTDs online, and the fact that oasis-open.org had blocked my IP address didn't matter anymore.
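For anyone repeating this setup, here is a minimal Python sketch of the two pieces that have to line up: the hosts entry and the location of the files under the web server's document root. The function names are made up for illustration, and the loopback address 127.0.0.1 is my assumption for "my local machine"; the document root and the DocBook 4.5 URL path are the ones described above.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Document root used in the post (Apache web sharing on Mac OS X).
DOC_ROOT = PurePosixPath("/Library/WebServer/Documents")

# Assumption: the loopback address stands in for "my local machine".
LOOPBACK = "127.0.0.1"


def hosts_entry(hostname: str, address: str = LOOPBACK) -> str:
    """Return one /etc/hosts line that points hostname at address."""
    return f"{address}\t{hostname}"


def local_path_for(url: str, doc_root: PurePosixPath = DOC_ROOT) -> PurePosixPath:
    """Map a DTD URL to the file the local web server must serve for it."""
    # The URL path is kept unchanged, so a request redirected by the hosts
    # file still finds the file at the location it asked for.
    return doc_root / urlparse(url).path.lstrip("/")


if __name__ == "__main__":
    dtd_url = "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
    print(hosts_entry(urlparse(dtd_url).hostname))
    print(local_path_for(dtd_url))
```

The point of the second function is that the directory layout under the document root has to mirror the URL path exactly; otherwise the local server answers the redirected request with a 404 and the build still fails.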

This really wasn't a big deal for me, but I question the motive for making the DTDs publicly available over HTTP if there are rules for accessing them. Why not post workarounds such as this one on the oasis-open.org site? Why not simply post a page of rules for accessing the DTDs? Of course, these items may be posted somewhere and I just wasn't able to find them. If so, that's a problem as well. Hint! Hint!


  1. Have you tried Apache's own XML Commons Resolver class?

    The docs are not intuitive (at least not the last time I read them, ages ago), but the Resolver is exactly the thing you should be using in the guts of your XML stack to fix this problem, without having to futz with system-level stuff like your hosts file.

  2. You might consider using a local (or organization-scoped) Maven repository manager. I used Artifactory to great effect; it's basically a caching proxy. Everyone in the office hits the local repo manager, and whatever packages it pulls off the internet are kept for the next person. There is an admin page so you can purge out-of-date packages.


  3. You can comment out the declaration line like this:

    <!-- <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"> -->

    Then find your local copy of the docbookx.dtd file.

    Then insert a new declaration line with the path to your local docbookx.dtd file, like this example, in which the file is in the /usr/share/xml/docbook/schema/dtd/4.1/ directory:

    <!DOCTYPE book PUBLIC "-//LOCAL//DTD DocBook XML V4.1.2//EN" "file:///usr/share/xml/docbook/schema/dtd/4.1/docbookx.dtd">
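The resolver idea from the first comment and the local path from the third can be combined without editing any document or the hosts file. The following is only a rough sketch in Python's standard xml.sax module, not the Apache XML Commons Resolver itself; the names LOCAL_DTDS, LocalDtdResolver and make_local_parser are hypothetical, and the local path is the example directory from the comment above, which may differ on your system.

```python
from pathlib import PurePosixPath
from xml.sax import make_parser
from xml.sax.handler import EntityResolver, feature_external_ges

# Hypothetical catalog: public identifier -> local copy of the DTD.
# The path is the example location from the comment above.
LOCAL_DTDS = {
    "-//OASIS//DTD DocBook XML V4.1.2//EN":
        "/usr/share/xml/docbook/schema/dtd/4.1/docbookx.dtd",
}


class LocalDtdResolver(EntityResolver):
    """Answer requests for known public identifiers with a local file URI."""

    def __init__(self, catalog):
        self.catalog = dict(catalog)

    def resolveEntity(self, publicId, systemId):
        local = self.catalog.get(publicId)
        if local is None:
            # Unknown identifier: fall back to the original system identifier.
            return systemId
        return PurePosixPath(local).as_uri()


def make_local_parser(catalog=LOCAL_DTDS):
    """Build a SAX parser that consults the catalog for external entities."""
    parser = make_parser()
    # Recent Python versions leave external entity loading off by default,
    # so the resolver is only consulted once this feature is enabled.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(LocalDtdResolver(catalog))
    return parser
```

With a resolver like this, the original DOCTYPE line can stay untouched in the source files; only the parser setup knows where the local copy lives. In a Java build, the XML Commons Resolver with an OASIS catalog file would presumably play the same role.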