SAX vs. DOM
As you know, DOM is an in-memory tree representation of an XML document or document fragment. It is a natural object model for XML, but it is not always practical. A large document can take up a lot of memory, which is overkill if all you want to do is find a small piece of data in it.
SAX is, in many ways, much simpler than DOM. There is no need to model every possible type of object that can be found in an XML document, which makes the API easy to understand and easy to use. DOM contains many interfaces, each with many methods, whereas SAX consists of a handful of classes and interfaces. SAX is also a much lower-level API than DOM. For these reasons, SAX parsers tend to be smaller than DOM implementations. In fact, many DOM implementations use SAX parsers under the hood to read in XML documents.
SAX is an event-based API. Instead of loading an entire document into memory all at once, SAX parsers read documents and notify a client program when elements, text, comments, and other data of interest are found. SAX parsers send you events continuously, telling you what was found next.
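To make the event model concrete, here is a minimal sketch using Python's standard xml.sax module. The EventLogger class and the sample document are illustrative assumptions, not part of SAX itself. The handler simply records each notification in the order the parser delivers it.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler that records each event the parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called when the parser encounters an opening tag.
        self.events.append(("start", name))

    def characters(self, content):
        # Called for character data; a parser may deliver one run of
        # text in several pieces, so real code should not assume one call.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        # Called when the parser encounters a closing tag.
        self.events.append(("end", name))


# Made-up sample document for illustration.
DOC = b"<order><item>widget</item><item>gadget</item></order>"

handler = EventLogger()
xml.sax.parseString(DOC, handler)
for event in handler.events:
    print(event)
```

Note that the client never asks the parser for anything. It only reacts to callbacks, and any state it wants to keep (here, the events list) is its own responsibility.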
The DOM parses XML in space, whereas SAX parses XML in time. In essence, a DOM parser hands you an entire document and allows you to traverse it any way you like, while a SAX parser reports the document’s contents one piece at a time as it reads them. Holding the whole document can take a lot of memory, so SAX can be significantly more efficient for large documents. In fact, with SAX you can process documents larger than available system memory, which is generally not possible with DOM. SAX can also be faster, because you can start processing before the entire document has been loaded. This is especially valuable when reading data over a network.
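The following sketch illustrates processing a document "in time". It feeds a synthetic document to the parser chunk by chunk, much as data might arrive from a network, and keeps only the one record it is looking for. The RecordFinder class, the chunks generator, and the record layout are assumptions made up for this example. It also assumes that the default parser returned by xml.sax.make_parser() supports incremental feed() calls, which holds for the expat-based parser shipped with Python.

```python
import xml.sax


class RecordFinder(xml.sax.ContentHandler):
    """Hypothetical handler that keeps the text of one <record> by id."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.in_target = False
        self.parts = []
        self.seen = 0

    def startElement(self, name, attrs):
        if name == "record":
            self.seen += 1
            self.in_target = attrs.get("id") == self.wanted_id

    def characters(self, content):
        # Text is retained only while inside the wanted record.
        if self.in_target:
            self.parts.append(content)

    def endElement(self, name):
        if name == "record":
            self.in_target = False

    @property
    def value(self):
        return "".join(self.parts)


def chunks(n_records):
    """Yield a synthetic document piece by piece, like successive reads."""
    yield b"<records>"
    for i in range(n_records):
        yield f'<record id="{i}">value-{i}</record>'.encode("ascii")
    yield b"</records>"


finder = RecordFinder("4242")
parser = xml.sax.make_parser()
parser.setContentHandler(finder)
for chunk in chunks(10_000):
    parser.feed(chunk)  # events fire as soon as each chunk is parsed
parser.close()
print(finder.seen, finder.value)
```

At no point does the program hold the full document. Each chunk is parsed and discarded, and only the text of the matching record is retained.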
In some cases, you might want to build your own object model of an XML document because DOM might not describe your specific document efficiently or in the way you would like. You could solve the problem by loading a document using DOM and translating the DOM object model into your own object model. However, this can be very inefficient, because the document ends up being modeled twice, so SAX is often a better solution: you can build your own objects directly from the parser’s events.
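As a sketch of this approach, the handler below builds a list of application-specific objects directly from SAX events, with no intermediate DOM tree. The Book class, the BookBuilder handler, and the catalog format are all hypothetical and chosen only for illustration.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Book:
    """Hypothetical application object, unrelated to any DOM node type."""
    isbn: str
    title: str = ""
    author: str = ""


class BookBuilder(xml.sax.ContentHandler):
    """Builds Book objects as the parser reports elements and text."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None
        self._text = []

    def startElement(self, name, attrs):
        self._text = []  # start collecting text for this element
        if name == "book":
            self._current = Book(isbn=attrs.get("isbn", ""))

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        if name in ("title", "author") and self._current is not None:
            setattr(self._current, name, text)
        elif name == "book":
            self.books.append(self._current)
            self._current = None
        self._text = []


# Made-up catalog document for illustration.
CATALOG = b"""<catalog>
  <book isbn="111"><title>First</title><author>Ann</author></book>
  <book isbn="222"><title>Second</title><author>Bob</author></book>
</catalog>"""

builder = BookBuilder()
xml.sax.parseString(CATALOG, builder)
for book in builder.books:
    print(book)
```

The trade-off is that the handler must track its own context (here, the current book and the text collected so far), which is bookkeeping that a DOM tree would otherwise provide.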