Document Type Definitions
DTD stands for Document Type Definition. A DTD lets the XML author define a set of rules that an XML document must follow in order to be valid. An XML document is considered “well formed” if it is syntactically correct according to the rules of XML 1.0. However, that does not mean the document is necessarily valid. To be considered valid, an XML document must be validated, or verified, against a DTD. The DTD defines which elements an XML document requires, which elements are optional, how many times an element must or may occur, and the order and nesting of elements. DTD markup also defines the type of data that may occur in an XML element and the attributes that may be associated with those elements. A document, even if well formed, is not considered valid if it does not follow the rules defined in the DTD.
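As a minimal sketch of the kinds of rules just described, the following document carries its DTD inline. The element and attribute names (order, customer, item, note, id, status) are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!-- An order holds exactly one customer, then one or more items,
       then an optional note, in that sequence. -->
  <!ELEMENT order (customer, item+, note?)>
  <!-- Each of these elements contains character data only. -->
  <!ELEMENT customer (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
  <!-- id is required; status is limited to two values and defaults to "open". -->
  <!ATTLIST order
    id     CDATA           #REQUIRED
    status (open | closed) "open">
]>
<order id="A-100" status="open">
  <customer>Example Corp</customer>
  <item>Widget</item>
  <item>Gasket</item>
</order>
```

The occurrence indicators do the counting here: a bare name means exactly once, + means one or more, and ? means zero or one. The document above omits the optional note element and is still valid.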
When an XML document is validated against a DTD by a validating XML parser, the document is checked to ensure that all required elements are present and that no undeclared elements have been added. The hierarchical structure of elements defined in the DTD must be maintained. The values of all attributes are checked to ensure that they conform to their declared types. No undeclared attributes are allowed, and no required attributes may be omitted. In short, the structure of the XML document is defined and validated from top to bottom by the DTD.
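To make these checks concrete, here is a sketch of a document that is well formed but not valid. The message vocabulary (message, to, body, cc, priority, urgency) is made up for this illustration; the comments mark the problems a validating parser would be expected to report.

```xml
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST message priority (low | high) #REQUIRED>
]>
<!-- Well formed, but not valid against the DTD above. -->
<message urgency="high">  <!-- priority is missing; urgency is undeclared -->
  <body>Hello</body>      <!-- body appears before to -->
  <to>Alice</to>
  <cc>Bob</cc>            <!-- cc is not declared anywhere in the DTD -->
</message>
```

A non-validating parser would accept this document without complaint, because every tag is properly opened, closed, and nested. Only validation against the DTD exposes the structural errors.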
Although validation is optional, if an XML author is publishing an XML document for which maintaining the structure is vital, the author can reference a DTD from the XML document and use a validating XML parser during processing. Requiring that an XML document be validated against a DTD ensures the integrity of the data structure.
XML documents may be parsed and validated before they are ever loaded by an application. That way, XML data that is not valid can be flagged as “invalid” before it ever gets processed by the application (thus saving a lot of the headaches that corrupt or incomplete data can cause).
Imagine a scenario where data is being exchanged in an XML format between multiple organizations. The integrity of business-to-business data is vital for the smooth functioning of commerce. There needs to be a way to ensure that the structure of the XML data does not change from organization to organization (thus rendering the data corrupt and useless). A DTD can ensure this.
An extra advantage of using DTDs in this situation is that a single DTD could be referenced by all the organizations’ applications. The defined structure of the data would be in a centralized resource, which means that any changes to the data structure definition would only need to be implemented in one place. All the applications that referenced the DTD would automatically use the new, updated structure.
A DTD can be internal, residing within the body of a single XML document. It can also be external, referenced by the XML document. A single XML document could even have both a portion (or subset) of its DTD that is internal and a portion that is external. As mentioned in the previous paragraph, a single external DTD can be referenced by many XML documents. Because an external DTD may be referenced by many documents, it is a good repository for global types of definitions (definitions that apply to all documents). An internal DTD is good to use for rules that only apply to that specific document. If a document has both internal and external DTD subsets, the internal subset is processed first, so its entity and attribute-list declarations take precedence over external declarations of the same item. An element type, however, may be declared only once across both subsets.
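The following sketch shows a document that combines both subsets. The file name catalog.dtd and all element and entity names are hypothetical. First, a shared external DTD that many documents could reference:

```xml
<!-- catalog.dtd: a hypothetical shared external DTD -->
<!ELEMENT catalog (title, publisher)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!-- Global default that applies to every document using this DTD. -->
<!ENTITY imprint "Global Publishing">
```

Next, a document that references the external DTD and adds an internal subset of its own:

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE catalog SYSTEM "catalog.dtd" [
  <!-- Internal subset: processed before catalog.dtd, so this
       declaration of imprint is the one that takes effect. -->
  <!ENTITY imprint "Local Press">
]>
<catalog>
  <title>Spring List</title>
  <publisher>&imprint;</publisher>
</catalog>
```

In this document the reference &imprint; expands to Local Press, because the first declaration of an entity is the binding one and the internal subset is read first. Other documents that reference catalog.dtd without redeclaring imprint would get Global Publishing.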
Given this brief overview, you can quickly see why a DTD would be important to applications that exchange data in an XML format. Before diving into the actual coverage of the structure of DTDs, take a look at a couple of quick examples. This will give you a better impression of what we are talking about as we go forward.