Introduction to XML Syntax
Every language and document representation format needs to have a goal. The “goal” of a document’s format gives it meaning and a long-term direction. After all, it is not possible for a single data-representation format to serve all possible data needs. The goal of the Microsoft Word format is to represent a word-processing document; the goal of the Microsoft Excel format is to represent a spreadsheet of numerical information. Although it is possible to use Excel to represent a word-processing document and Word to encode numerical data, these are not the “intended uses” of these document formats. Continuing development of the formats will not make Word a better spreadsheet or Excel a better word processor. Trying to do so is like fitting a square peg into a round hole.
So, what is the goal of XML and its intended use? We have spent a chapter talking about how XML can be used to encode any structured information, but the one-size-fits-all document format simply doesn’t exist. XML is good at representing information that has an extensible, hierarchical format and requires encoding of metadata. These three concepts (extensibility, hierarchy, and metadata) form the basis of the XML language’s structure and data model.
Markup Languages and Self-Describing Data
One of the early design goals of XML was that it should be fairly easy to create XML documents using standard text editors and widely available editing tools. This is a legacy of the SGML and HTML languages, which are also text based. These languages use “markup” to encode metadata in a text format. The main concept behind markup languages is that they use special text strings to surround data or text with information that indicates the context, meaning, or interpretation of the data encapsulated in the markup. In effect, markup languages in general, and XML in particular, contain only two kinds of information: markup elements and the actual data that are given meaning by these markup elements. Together, these are known as XML text.
Markup text has a couple of rules. First, it needs to be differentiated from the rest of the document text by a unique set of characters that delineates where the markup information starts and ends. These special characters are known as delimiters. The XML language has four special delimiters, as outlined in Table 2.1.
In XML, angle brackets (the less-than and greater-than signs) are used to delimit an XML “tag,” and the ampersand and semicolon characters delimit “entity” information. Tags are a unit of information that we will refer to later when we start talking about XML elements, and entities provide another way of encoding specific information within an XML document.
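To make the role of the four delimiter characters concrete, here is a minimal Python sketch using the standard library. The element name note and the sample text are hypothetical, chosen only for illustration. The sketch assumes we want to store text that itself contains delimiter characters, so they must be written as entity references before the text is placed between tags.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text that happens to contain two of the delimiter characters.
raw_text = "shirts & ties < 20 USD"

# escape() rewrites &, < and > as entity references (&amp; &lt; &gt;),
# each of which starts with an ampersand and ends with a semicolon.
escaped = escape(raw_text)

# <note> is a hypothetical element name used only for this illustration.
document = "<note>" + escaped + "</note>"

# The parser reads the angle brackets as tag delimiters and turns the
# entity references back into the original characters.
element = ET.fromstring(document)

print(escaped)
print(element.tag)
print(element.text)
```

After parsing, the element text is identical to the original string, which shows that entities are simply another encoding of the same data.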
Of course, the data contained within the delimiting characters is where all the information lies. Because XML is a plain-text language, markup tags can indicate what information is being described. This is a major feature of XML and similar languages: the ability of an XML document to self-describe what it is talking about. Listing 2.1 shows a simple XML document that demonstrates the self-describing property of XML.
It is clear from this example that the markup tag is talking about someone’s first name, and the encapsulated text is the actual first name. The power of a self-describing language is tremendous. It simplifies document creation, maintenance, and debugging. It also makes it easier to communicate with others who may not have prior knowledge of a document’s contents. Of course, the big drawback of such languages is that they take up a lot of space. But nowadays, disk space and memory are plentiful and cheap.
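The trade-off between self-description and size can be seen in a few lines of Python. This is only a sketch: the element name first_name and the value Ada are invented for illustration and are not taken from Listing 2.1.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the element name and value are invented here.
fragment = "<first_name>Ada</first_name>"

element = ET.fromstring(fragment)

# The tag names the data; the text is the data itself.
label = element.tag
value = element.text
print(f"{label}: {value}")

# The characters spent on markup are the price of self-description.
overhead = len(fragment) - len(value)
print(f"{overhead} of {len(fragment)} characters are markup")
```

A reader (or a program) with no prior knowledge of the document can tell what the value means, at the cost of markup that is several times longer than the data it describes.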
A Simple XML Document
Throughout this chapter, we will refer to a simple XML document to demonstrate the various portions of an XML document and how it is structured. In this case, we’ll talk about a shirt. There’s actually a lot we can talk about with regard to a shirt: size, color, fabric, price, brand, and condition, among other properties. Listing 2.2 shows one possible XML rendition of a document describing a shirt. Of course, there are many other possible ways to describe a shirt, but this example provides a foundation for our further discussions.
LISTING 2.2 A Simple XML Document
<?xml version="1.0"?>
<shirt>
  <model>Zippy Tee</model>
  <brand>Tommy Hilbunger</brand>
  <price currency="USD">14.99</price>
  <on_sale/>
  <fabric content="60%">cotton</fabric>
  <fabric content="40%">polyester</fabric>
  <options>
    <colorOptions>
      <color>red</color>
      <color>white</color>
    </colorOptions>
    <sizeOptions>
      <size>Medium</size>
      <size>Large</size>
    </sizeOptions>
  </options>
  <description>
    This is a <b>funky</b> Tee shirt similar to the Floppy Tee shirt
  </description>
</shirt>
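As a rough illustration of how a program might read the shirt document, the following Python sketch parses it with the standard library's xml.etree.ElementTree module. The choice of Python and of this parser is our assumption, not something the chapter prescribes, and the variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

# The shirt document from Listing 2.2, embedded as a string.
SHIRT_XML = """<?xml version="1.0"?>
<shirt>
  <model>Zippy Tee</model>
  <brand>Tommy Hilbunger</brand>
  <price currency="USD">14.99</price>
  <on_sale/>
  <fabric content="60%">cotton</fabric>
  <fabric content="40%">polyester</fabric>
  <options>
    <colorOptions>
      <color>red</color>
      <color>white</color>
    </colorOptions>
    <sizeOptions>
      <size>Medium</size>
      <size>Large</size>
    </sizeOptions>
  </options>
  <description>
    This is a <b>funky</b> Tee shirt similar to the Floppy Tee shirt
  </description>
</shirt>"""

shirt = ET.fromstring(SHIRT_XML)

# Element content: the text between a start tag and its end tag.
model = shirt.findtext("model")

# Attributes carry metadata about the element's content.
price = shirt.find("price")
amount = float(price.text)
currency = price.get("currency")

# An empty element conveys information simply by being present.
on_sale = shirt.find("on_sale") is not None

# Repeated elements map naturally onto collections.
fabrics = {f.text: f.get("content") for f in shirt.findall("fabric")}

# The hierarchy can be walked with a path of element names.
colors = [c.text for c in shirt.findall("options/colorOptions/color")]
sizes = [s.text for s in shirt.findall("options/sizeOptions/size")]

# Mixed content: gather the text around and inside <b>, then
# collapse the whitespace that came from the indentation.
description = " ".join("".join(shirt.find("description").itertext()).split())

print(model, amount, currency, on_sale)
print(fabrics, colors, sizes)
print(description)
```

Each lookup corresponds to one structural feature of the listing: element content, attributes, an empty element, repeated elements, nesting, and mixed text and markup.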