Chapter: XML and Web Services : Building XML-Based Applications : Parsing XML Using Document Object Model

DOM Implementations: JDOM, NanoXML, TinyXML, kXML

Other DOM Implementations

Some have argued that DOM as specified by the W3C is not always the best choice. One objection is that it’s too complex; JDOM appeared as an alternative that addresses this. Another objection is that DOM takes too much memory and is not practical for resource-constrained devices such as PDAs and cellular phones. For these applications, a number of DOM-like APIs have appeared. In this section, we’ll look at some of these alternative implementations.

JDOM

JDOM is not an acronym. It was originally developed as an open-source API for XML and has since been accepted by the Java Community Process (JCP) as JSR-102. The home of JDOM is www.jdom.org.

JDOM was designed specifically for Java. In contrast, DOM is purely an interface specification independent of any language. Because it targets Java only, JDOM can leverage standard Java types and collections, such as the String class and the Collections API. The goal of W3C DOM is to be language independent, which works but can add a lot of unnecessary complications. Here are some of the guiding principles of JDOM:

JDOM should be straightforward for Java programmers.

JDOM should support easy and efficient document modification.

JDOM should hide the complexities of XML wherever possible, while remaining true to the XML specification.

JDOM should integrate with DOM and SAX.

JDOM should be lightweight and fast.

JDOM should solve 80 percent (or more) of Java/XML problems with 20 percent (or less) of the effort when compared with DOM.

JDOM is a class-based API, whereas DOM is an interface-based API. There are classes that encapsulate documents, elements, attributes, text, and so on. This simplifies usage by minimizing downcasts. DOM is a strict hierarchy based on a node, which leads to lots of downcasts. Downcasts add complexity to source code and also reduce performance.
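To make the point about downcasts concrete, here is a minimal sketch that uses only the W3C DOM classes shipped with the JDK (no JDOM). The class name DomDowncastSketch, the SAMPLE string, and the authors() helper are made up for this illustration. Note how every child comes back as a generic Node that must be type-checked and cast to Element before its attributes can be read, and how NodeList has to be walked by index because it is not a java.util collection.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomDowncastSketch {

    // Hypothetical sample document, modeled on the library example in this section.
    static final String SAMPLE =
        "<library><fiction><book author=\"Herman Melville\">Moby Dick</book></fiction></library>";

    // Returns the "author" attribute of every element two levels below the root.
    static List<String> authors(String xml) {
        List<String> result = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // NodeList is not a java.util.List, so it is walked by index.
            NodeList sections = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < sections.getLength(); i++) {
                NodeList books = sections.item(i).getChildNodes();
                for (int j = 0; j < books.getLength(); j++) {
                    Node node = books.item(j);
                    // Children may be text, comments, and so on; check before casting.
                    if (node.getNodeType() == Node.ELEMENT_NODE) {
                        Element book = (Element) node;   // the downcast
                        result.add(book.getAttribute("author"));
                    }
                }
            }
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle parser errors.
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(authors(SAMPLE));
    }
}
```

In a class-based API such as JDOM, the corresponding calls return element objects directly, so the type check and cast are not needed.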

JDOM does not parse XML by itself; rather, it can build JDOM objects from a DOM tree or a SAX parser. In general, it is more efficient to use JDOM’s SAXBuilder class if all you want to do is read XML from a file or stream.

JDOM Example

Let’s create an XML document using JDOM. The source code for JDOMCreate.java appears in Listing 7.13.

LISTING 7.13 JDOMCreate.java

package com.madhu.xml;

import org.jdom.*;
import org.jdom.output.*;

public class JDOMCreate {

    public static void main(String[] args) throws Exception {
        // A Document is always created around its root element
        Element root = new Element("library");
        Document doc = new Document(root);

        Element fiction = new Element("fiction");
        Element book = new Element("book");
        book.setAttribute("author", "Herman Melville");
        book.addContent("Moby Dick");   // text content

        fiction.addContent(book);       // element content
        root.addContent(fiction);

        // Indent with tabs and emit new lines. This (indent, newlines) constructor
        // is from the JDOM beta releases; JDOM 1.x takes a Format object instead.
        XMLOutputter outputter = new XMLOutputter("\t", true);
        outputter.output(doc, System.out);
    }
}

Most of the JDOM classes are in the org.jdom package. We only need the org.jdom.output package in order to write the output using XMLOutputter. As advertised, JDOM code is very simple. To create a document, all we need to do is create elements, using any of the Element class constructors. Once that is done, we can set attributes and add content. The addContent() method is overloaded, so you can add text or elements using the same method name. Notice that you must create the Document object given a root element. This is done to make sure the document is always well formed.

Once the object graph representing our document is created, we can write it out to a stream using the XMLOutputter class. In the example, we write the document to System.out. We could write it to a file using FileOutputStream as well. The output appears in Listing 7.14.

LISTING 7.14 JDOMCreate Output

<?xml version="1.0" encoding="UTF-8"?>
<library>
	<fiction>
		<book author="Herman Melville">Moby Dick</book>
	</fiction>
</library>

Notice the nice formatting of the output. Indenting and new lines make the document look as if it was hand-edited. Formatting can be controlled through constructor parameters of the XMLOutputter class. In the example, we specified a Tab character (\t) for indenting and set new lines to true. This can be particularly handy if the XML documents you create are available for human consumption (which they often are).
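For comparison, the same library document can be built with the W3C DOM and JAXP transformer classes that ship with the JDK. This is only a sketch to show the difference in style: the class name DomCreateSketch and the buildLibrary() helper are made up for this illustration, and the exact whitespace of the indented output depends on the transformer implementation in use.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomCreateSketch {

    // Builds the library document with W3C DOM and returns it as indented text.
    static String buildLibrary() {
        try {
            // In DOM, nodes are created through the owning Document (a factory),
            // not with constructors as in JDOM.
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

            Element root = doc.createElement("library");
            doc.appendChild(root);

            Element fiction = doc.createElement("fiction");
            Element book = doc.createElement("book");
            book.setAttribute("author", "Herman Melville");
            // Text must be wrapped in its own node before it can be appended.
            book.appendChild(doc.createTextNode("Moby Dick"));

            fiction.appendChild(book);
            root.appendChild(fiction);

            // Serialization goes through an identity transform.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle these errors.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(buildLibrary());
    }
}
```

The factory calls, the explicit text node, and the transformer setup are the kind of overhead that JDOM’s constructors and XMLOutputter are meant to remove.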

Reading and parsing an XML document is even easier. As mentioned earlier, JDOM is not meant to be a parser replacement. JDOM uses existing parsers to avoid reinventing the wheel. If you have an existing DOM or SAX parser, you can use it with JDOM. The JDOM distribution includes Apache Xerces, so you can be up and running right away.

The following example parses an XML document and then prints it out using XMLOutputter. The source code for JDOMParse.java appears in Listing 7.15.

LISTING 7.15 JDOMParse.java

package com.madhu.xml;

import java.io.*;

import org.jdom.*;
import org.jdom.input.*;
import org.jdom.output.*;

public class JDOMParse {

    public static void main(String[] args) throws Exception {
        // Build a JDOM Document from the file named on the command line,
        // using an underlying SAX parser
        SAXBuilder builder = new SAXBuilder();
        Document doc = builder.build(new File(args[0]));

        // Echo the parsed document, indented with tabs and with new lines
        XMLOutputter outputter = new XMLOutputter("\t", true);
        outputter.output(doc, System.out);
    }
}

You can use either a DOM or SAX parser in order to parse a document and produce a JDOM Document object. In practice, SAX parsers tend to be more efficient in terms of memory because the parser does not hold the entire document in memory at once, as a DOM parser does. The SAXBuilder class can build a document given a File object, InputStream, or a number of other sources.
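The memory argument is easier to see by looking at what a SAX parser actually hands to its client. The sketch below uses only the SAX classes bundled with the JDK; the class name SaxEventSketch and the elementNames() helper are made up for this illustration. The parser calls startElement() once per element as it reads the input and never builds a tree of its own, so any in-memory structure (here a simple list of names, in JDOM’s case the JDOM object graph) is whatever the handler chooses to keep.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventSketch {

    // Records element names in the order the parser reports them.
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();

        // The handler receives one callback per start tag as the input is read.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle parser errors.
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<library><fiction><book author=\"Herman Melville\">"
                   + "Moby Dick</book></fiction></library>";
        System.out.println(elementNames(xml));
    }
}
```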

Small DOM-like Implementations

PDAs and cellular phones are rapidly becoming the terminals of choice for people on the run. They are a lot easier to carry than a laptop. (Remember the “luggables” of the mid-1980s? We’ve come a long way since then!)

With the availability of Java 2 Micro Edition (J2ME) and the wireless Web, XML is becoming more important on these small devices. If you’re going to work with XML on a PDA, something like DOM is a great help. Of course, a full-blown DOM implementation is too much for a PDA, but there are smaller, simpler alternatives, and you have several solutions from which to choose.

NanoXML

NanoXML is a nonvalidating parser available at http://nanoxml.sourceforge.net. It looks a lot like DOM, but it’s much smaller. Version 2.0 is about 33KB, but a light version is available that’s less than 6KB! The API contains a class called XMLElement, which is very similar to the Node interface found in DOM.

TinyXML

TinyXML is a nonvalidating parser available at http://www.gibaradunn.srac.org/tiny/index.shtml. It’s primarily for reading in an XML document, because it does not provide facilities to create a document. It’s extremely simple, based primarily on one class, TinyParser, and one interface, ParsedXML. All you need to do is call a static method in TinyParser to parse a stream, file, or URL. This gives you an instance of a ParsedXML interface that has only seven methods. The uncompressed class files are about 16KB.

kXML

kXML is a DOM-like parser in the spirit of JDOM. The primary difference is that it is designed specifically for J2ME resource-constrained devices. kXML can be found at http://www.kxml.org. kXML is probably the most sophisticated of the three small parsers mentioned.
