
Chapter: XML and Web Services : Essentials of XML : The Fundamentals of XML

Introduction to XML Syntax

Markup Languages and Self-Describing Data, A Simple XML Document


Every language and document representation format needs a goal. The goal of a format gives it meaning and a long-term direction. After all, no single data-representation format can serve every possible data need. The goal of the Microsoft Word format is to represent a word-processing document; the goal of the Microsoft Excel format is to represent a spreadsheet of numerical information. Although it is possible to use Excel to represent a word-processing document and Word to encode numerical data, these are not the intended uses of these formats. Continued development of the formats will not make Word a better spreadsheet or Excel a better word processor. It’s like fitting a square peg in a round hole.

So, what is the goal of XML and its intended use? We have spent a chapter talking about how XML can be used to encode any structured information, but the one-size-fits-all document format simply doesn’t exist. XML is good at representing information that is extensible, is hierarchical, and requires encoding of metadata. These three concepts form the basis of the XML language’s structure and data model.

Markup Languages and Self-Describing Data

 

One of the early design goals of XML was that it should be fairly easy to create XML documents using standard text editors and widely available editing tools. This is a legacy of the SGML and HTML languages, which are also text based. These languages use “markup” to encode metadata in a text format. The main concept behind markup languages is that they use special text strings to surround data or text with information that indicates the context, meaning, or interpretation of the data encapsulated in the markup. In effect, markup languages in general, and XML in particular, contain only two kinds of information: markup elements and the actual data that these markup elements give meaning to. Together, these are known as XML text.

Markup text has a couple of rules. First, it needs to be differentiated from the rest of the document text by a unique set of characters that delineates where the markup information starts and ends. These special characters are known as delimiters. The XML language has four special delimiters, as outlined in Table 2.1.

TABLE 2.1 XML Delimiter Characters

Character    Meaning
<            Start of an XML markup tag
>            End of an XML markup tag
&            Start of an XML entity
;            End of an XML entity


In XML, angle brackets (the less-than and greater-than signs) delimit an XML “tag,” and the ampersand and semicolon characters delimit “entity” information. Tags are a unit of information that we will refer to later when we start talking about XML elements, and entities provide another way of encoding specific information within an XML document.
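
To see the four delimiters at work, here is a minimal sketch in Python using only the standard library. The `<note>` element and its text are hypothetical and chosen purely for illustration. The `escape` helper rewrites literal `&`, `<`, and `>` characters as entity references so that they are not mistaken for markup, and the parser turns them back into plain characters.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical text that happens to contain XML delimiter characters.
raw_text = "Sizes < Large & prices > 10"

# escape() replaces &, <, and > with the entity references
# &amp;, &lt;, and &gt; (each one delimited by '&' and ';').
escaped = escape(raw_text)

# Angle brackets delimit the tags; the entities carry the literal characters.
document = "<note>" + escaped + "</note>"

# Parsing resolves the entity references back to the original characters.
element = ET.fromstring(document)

print(escaped)
print(element.text == raw_text)
```

Without the escaping step, the bare `<` and `&` in the text would be read as the start of a tag and the start of an entity, and the parser would reject the document.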

 

Of course, the data contained within the delimiting characters is where all the information lies. Because XML is a plain-text language, markup tags can indicate what information is being described. This is a major feature of XML and similar languages, namely, the ability of an XML document to self-describe what it is talking about. The following example in Listing 2.1 shows a simple XML document that demonstrates the self-describing property of XML.


It is clear from this example that the markup tag is talking about someone’s first name, and the encapsulated text is the actual first name. The power of a self-describing language is tremendous: it simplifies document creation, maintenance, and debugging, and it makes it easier to communicate with others who may not have prior knowledge of a document’s contents. The big drawback of such languages is that they take up a lot of space, although disk space and memory are now plentiful and cheap.
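
The self-describing idea can be sketched in a few lines of Python. The `<person>`, `<firstName>`, and `<lastName>` elements and the name used here are assumptions made for illustration and are not taken from Listing 2.1. The point is only that the tag name tells a reader, or a program, what the enclosed text means.

```python
import xml.etree.ElementTree as ET

# Hypothetical self-describing fragment: the tag names say what the data is.
document = """<person>
  <firstName>John</firstName>
  <lastName>Doe</lastName>
</person>"""

root = ET.fromstring(document)

# Pair each markup tag with the data it describes, without any
# prior knowledge of the document's layout.
described = {child.tag: child.text for child in root}

for tag, text in described.items():
    print(f"{tag}: {text}")
```

The same loop works on any flat record, because the labels travel with the data rather than living in a separate format description.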

 

A Simple XML Document

 

Throughout this chapter, we will refer to a simple XML document to demonstrate the various portions of an XML document and how it is structured. In this case, we’ll talk about a shirt. There’s a lot we can say about a shirt: size, color, fabric, price, brand, and condition, among other properties. Listing 2.2 shows one possible XML rendition of a document describing a shirt. Of course, there are many other possible ways to describe a shirt, but this example provides a foundation for our further discussions.

LISTING 2.2 A Simple XML Document

<?xml version="1.0"?>
<shirt>
  <model>Zippy Tee</model>
  <brand>Tommy Hilbunger</brand>
  <price currency="USD">14.99</price>
  <on_sale/>
  <fabric content="60%">cotton</fabric>
  <fabric content="40%">polyester</fabric>
  <options>
    <colorOptions>
      <color>red</color>
      <color>white</color>
    </colorOptions>
    <sizeOptions>
      <size>Medium</size>
      <size>Large</size>
    </sizeOptions>
  </options>
  <description>
    This is a <b>funky</b> Tee shirt similar to the Floppy Tee shirt
  </description>
</shirt>
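
As a rough illustration of how a program might read this document, the following Python sketch parses the shirt document from Listing 2.2 with the standard library's `xml.etree.ElementTree` module. The attribute values are typed with straight quotation marks, which XML requires. The variable names are hypothetical, and this is only one of many ways to walk the structure.

```python
import xml.etree.ElementTree as ET

# The shirt document from Listing 2.2.
SHIRT_XML = """<?xml version="1.0"?>
<shirt>
  <model>Zippy Tee</model>
  <brand>Tommy Hilbunger</brand>
  <price currency="USD">14.99</price>
  <on_sale/>
  <fabric content="60%">cotton</fabric>
  <fabric content="40%">polyester</fabric>
  <options>
    <colorOptions>
      <color>red</color>
      <color>white</color>
    </colorOptions>
    <sizeOptions>
      <size>Medium</size>
      <size>Large</size>
    </sizeOptions>
  </options>
  <description>
    This is a <b>funky</b> Tee shirt similar to the Floppy Tee shirt
  </description>
</shirt>
"""

shirt = ET.fromstring(SHIRT_XML)

# Element text and an attribute (metadata about the price).
price = shirt.find("price")
amount = float(price.text)
currency = price.get("currency")

# An empty element carries meaning simply by being present.
on_sale = shirt.find("on_sale") is not None

# Repeated sibling elements, each with its own attribute.
fabrics = {f.text: f.get("content") for f in shirt.findall("fabric")}

# Nested (hierarchical) elements reached by path.
colors = [c.text for c in shirt.findall("options/colorOptions/color")]
sizes = [s.text for s in shirt.findall("options/sizeOptions/size")]

# Mixed content: gather the text around and inside <b>, then
# collapse the surrounding whitespace.
description = " ".join("".join(shirt.find("description").itertext()).split())

print(amount, currency, on_sale)
print(fabrics, colors, sizes)
print(description)
```

The sketch touches each structural feature of the listing in turn: attributes, an empty element, repeated elements, nesting, and text mixed with inline markup.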

 

