Rules of XML Structure
We have explored the structure of XML documents, but there are various rules that XML documents must comply with in order for them to be appropriately processed and parsed. Some of these rules enforce the hierarchical, structured nature of XML, whereas others impose restrictions to simplify the task of XML processing for applications.
All XML Elements Must Have a Closing Tag
Even though other markup languages such as HTML allow their markup tags to remain “open” or contain only a beginning element tag, XML requires all tags to be closed. They can be closed by matching a beginning element tag with a closing tag, or they can be closed by the use of empty elements. In either case, no tag may be left unclosed. Listing 2.11 shows this incorrect use of XML.
LISTING 2.11 Incorrect XML Due to Unclosed Tags
<markup>This is not valid XML <markup>Since there is no closing tag
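One way to see this rule in action is to hand the text to an XML parser. The sketch below is only an illustration, assuming Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the two corrected strings are hypothetical and not part of the original listing.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The text from Listing 2.11: two start tags, no end tag.
unclosed = "<markup>This is not valid XML <markup>Since there is no closing tag"
# Two ways to close an element: a matching end tag, or an empty-element tag.
closed = "<markup>Every start tag has a matching end tag</markup>"
empty = "<markup/>"

print(is_well_formed(unclosed))  # False
print(is_well_formed(closed))    # True
print(is_well_formed(empty))     # True
```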
XML Tags Are Case Sensitive
In XML, the use of capitalization is incredibly important. XML elements and attributes are case sensitive. This means that differences in capitalization will be interpreted as different elements or attributes. This differs from HTML, where tags are not case sensitive and arbitrary capitalization is allowed. In XML, the elements <shirt> and <Shirt> are as different as <egg> and <house>. Listing 2.12 shows an example of the incorrect matching of element capitalization.
LISTING 2.12 Incorrect XML Due to Capitalization Mismatch
<Markup>These two tags are very different</markup>
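The effect of case sensitivity can be checked with a parser. This is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module; the helper `is_well_formed` and the corrected string are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Listing 2.12: the end tag does not match the start tag's capitalization.
mismatched = "<Markup>These two tags are very different</markup>"
# Same content with consistent capitalization.
matched = "<Markup>These two tags now match</Markup>"

print(is_well_formed(mismatched))  # False
print(is_well_formed(matched))     # True

# <shirt> and <Shirt> are parsed as two distinct element names.
print(ET.fromstring("<shirt/>").tag == ET.fromstring("<Shirt/>").tag)  # False
```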
All XML Elements Must Have Proper Nesting
Unlike languages such as HTML, XML requires that elements be nested in proper hierarchical order. Tags must be closed in the reverse order in which they are opened. A proper analogy is to think of XML tags as envelopes. There must never be a case where one envelope is closed when an envelope contained within it is still open. Listing 2.13 shows an incorrect nesting order of XML elements.
LISTING 2.13 Incorrect XML Due to Improper Element Nesting
<oxygen><nitrogen>These tags are improperly nested</oxygen></nitrogen>
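The "reverse order" requirement maps naturally onto a stack: each start tag is pushed, and each end tag must match the tag on top. The following Python sketch is a deliberately simplified checker, not a real XML parser. It assumes tags without attributes and ignores empty-element tags, and the function name `properly_nested` is hypothetical.

```python
import re

# Matches simple start tags (<name>) and end tags (</name>) only.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*>")


def properly_nested(text: str) -> bool:
    """Check that every end tag closes the most recently opened element."""
    stack = []
    for closing, name in TAG.findall(text):
        if not closing:
            stack.append(name)        # open a new "envelope"
        elif not stack or stack.pop() != name:
            return False              # closed an outer envelope too early
    return not stack                  # nothing may remain open


improper = "<oxygen><nitrogen>These tags are improperly nested</oxygen></nitrogen>"
proper = "<oxygen><nitrogen>These tags are properly nested</nitrogen></oxygen>"

print(properly_nested(improper))  # False
print(properly_nested(proper))    # True
```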
All XML Documents Must Contain a Single Root Element
XML documents must contain exactly one root element: no fewer, and certainly no more. All other elements in the XML document are nested within this root element. Once the root element is defined, any number of child elements can branch off it as long as they follow the other rules mentioned in this section. The root element is the most important one in the document because it contains all the other elements and reflects the document type as declared in the Document Type Declaration. The root element can appear only once: it cannot be repeated, nor can there be multiple, different root elements. Listing 2.14 illustrates the improper use of root elements.
LISTING 2.14 Incorrect XML Due to Multiple Root Elements
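The single-root rule can be illustrated with a short Python sketch, assuming the standard `xml.etree.ElementTree` module. The <shirt> and <catalog> documents below are hypothetical examples written for this illustration, as is the helper `is_well_formed`.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical document with two top-level elements: not well formed.
two_roots = "<shirt>Zippy Tee</shirt><shirt>Other Tee</shirt>"
# Wrapping both elements in one (hypothetical) root element fixes it.
one_root = "<catalog><shirt>Zippy Tee</shirt><shirt>Other Tee</shirt></catalog>"

print(is_well_formed(two_roots))  # False
print(is_well_formed(one_root))   # True

root = ET.fromstring(one_root)
print(root.tag, len(root))        # catalog 2
```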
Attribute Values Must Be Quoted
When attributes are used within XML elements, their values must be surrounded by quotes. XML accepts either single or double quotes for attribute values, although double quotes are the more common convention. If you need a literal quote character within an attribute value, you can use the entity reference &quot; or &apos; to insert it. Listing 2.15 illustrates the improper use of non-quoted attributes.
LISTING 2.15 Incorrect XML Due to Improper Quoting of Attributes
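The quoting rule can be sketched in Python, assuming the standard `xml.etree.ElementTree` module. The <shirt> strings and the helper `is_well_formed` are hypothetical examples written for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


unquoted = "<shirt size=large>Zippy Tee</shirt>"        # value lacks quotes
double_quoted = '<shirt size="large">Zippy Tee</shirt>'
single_quoted = "<shirt size='large'>Zippy Tee</shirt>"

print(is_well_formed(unquoted))       # False
print(is_well_formed(double_quoted))  # True
print(is_well_formed(single_quoted))  # True

# &quot; inserts a literal double quote inside a double-quoted value.
escaped = ET.fromstring('<shirt label="The &quot;Zippy&quot; Tee"/>')
print(escaped.get("label"))           # The "Zippy" Tee
```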
Attributes May Only Appear Once in the Same Start Tag
Even though attributes may be optional, when they are present, they can only appear once. This simple restriction prevents ambiguity when multiple, conflicting attribute name/value pairs are present. By only allowing a single attribute name/value pair to be present, the system avoids any conflicts or other errors. Listing 2.16 shows the improper use of multiple attributes within a single element.
LISTING 2.16 Incorrect XML Due to Multiple Attribute Names in Start Tag
<shirt size="large" size="small">Zippy Tee</shirt>
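A parser rejects the duplicated attribute outright rather than picking one of the values. The sketch below assumes Python's standard `xml.etree.ElementTree` module; the helper `is_well_formed` and the corrected string are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Listing 2.16, written with straight quotes: `size` appears twice.
duplicate = '<shirt size="large" size="small">Zippy Tee</shirt>'
# The same element with a single `size` attribute.
single = '<shirt size="large">Zippy Tee</shirt>'

print(is_well_formed(duplicate))        # False
print(ET.fromstring(single).attrib)     # {'size': 'large'}
```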
Attribute Values Cannot Contain References to External Entities
Although references to external entities may be allowed in general markup text, attribute values cannot contain them. However, attribute values can make use of internally defined entities and the predefined entities, such as &lt; and &quot;.
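The difference between internal and external entities in attribute values can be sketched as follows, assuming Python's standard `xml.etree.ElementTree` module (which is built on the Expat parser). The entity names `brand` and `note`, the file name `note.txt`, and the helper `is_well_formed` are all hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# An internal entity: its replacement text is given in the declaration.
internal = (
    '<!DOCTYPE shirt [<!ENTITY brand "Zippy">]>'
    '<shirt label="&brand; Tee"/>'
)
# An external entity: its content lives in another (hypothetical) file.
external = (
    '<!DOCTYPE shirt [<!ENTITY note SYSTEM "note.txt">]>'
    '<shirt label="&note;"/>'
)

print(ET.fromstring(internal).get("label"))  # Zippy Tee
print(is_well_formed(external))              # False

# Predefined entities such as &lt; are also allowed in attribute values.
print(ET.fromstring('<shirt label="S &lt; M"/>').get("label"))  # S < M
```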
All Entities Except amp, lt, gt, apos, and quot Must Be Declared Before They Are Used
Entities cannot be used before they are declared. A reference to an undeclared entity results in an XML document that is not well formed. However, a small number of entities are predefined and can be assumed to be available in every XML processor. These are limited to the five entities &amp;, &lt;, &gt;, &apos;, and &quot;.
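The following sketch contrasts the predefined entities with an undeclared and a declared one. It assumes Python's standard `xml.etree.ElementTree` module; the entity name `copy`, its replacement text, and the helper `is_well_formed` are hypothetical choices for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The five predefined entities need no declaration.
predefined = ET.fromstring("<note>&amp; &lt; &gt; &apos; &quot;</note>")
print(predefined.text)                # & < > ' "

# Any other entity must be declared before it is referenced.
undeclared = "<note>&copy;</note>"
declared = '<!DOCTYPE note [<!ENTITY copy "(c)">]><note>&copy;</note>'

print(is_well_formed(undeclared))     # False
print(ET.fromstring(declared).text)   # (c)
```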
Other Rules of XML Structure
Other rules exist for well-formed XML. For example, binary (unparsed) entities cannot be referenced in the general content of an XML document. Rather, these entities can only be used in an attribute declared as ENTITY or ENTITIES. Also, text and parameter entities are not allowed to be directly or indirectly recursive, and the replacement text for all parameter entities referenced inside a markup declaration must consist of complete markup declarations.
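Two of these rules can be sketched with a parser, assuming Python's standard `xml.etree.ElementTree` module (built on Expat). The entity names `a`, `b`, and `logo`, the notation `gif`, the file name `logo.gif`, and the helper `is_well_formed` are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Indirect recursion: `a` refers to `b`, which refers back to `a`.
recursive = (
    '<!DOCTYPE note [<!ENTITY a "&b;"><!ENTITY b "&a;">]>'
    "<note>&a;</note>"
)

# A binary (unparsed) entity referenced in element content.
binary_in_content = (
    "<!DOCTYPE note ["
    '<!NOTATION gif SYSTEM "viewer">'
    '<!ENTITY logo SYSTEM "logo.gif" NDATA gif>'
    "]><note>&logo;</note>"
)

print(is_well_formed(recursive))          # False
print(is_well_formed(binary_in_content))  # False
```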