XML: Extensible Markup Language
Many electronic commerce (e-commerce) and other Internet applications provide Web interfaces to access information stored in one or more databases. These databases are often referred to as data sources. It is common to use two-tier and three-tier client/server architectures for Internet applications (see Section 2.5), although other variations of the client/server model are sometimes used. E-commerce and other Internet database applications are designed to interact with the user through Web interfaces that display Web pages.
The common method of specifying the contents and formatting of Web pages is through the use of hypertext documents. There are various languages for writing these documents, the most common being HTML (HyperText Markup Language). Although HTML is widely used for formatting and structuring Web documents, it is not suitable for specifying structured data that is extracted from databases. A newer language, XML (Extensible Markup Language), has emerged as the standard for structuring and exchanging data over the Web. XML can be used to provide information about the structure and meaning of the data in the Web pages, rather than just specifying how the Web pages are formatted for display on the screen. The formatting aspects are specified separately, for example by using a formatting language such as XSL (Extensible Stylesheet Language) or a transformation language such as XSLT (Extensible Stylesheet Language for Transformations, or simply XSL Transformations). Recently, XML has also been proposed as a possible model for data storage and retrieval, although only a few experimental database systems based on XML have been developed so far.
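To make the contrast with HTML concrete, the following minimal sketch parses a small XML fragment with Python's standard library. The element names (flight, number, arrival, time, gate) and all values are hypothetical, invented only for illustration. The point is that the tags describe what each value means, not how it should be displayed.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment: tag names and values are invented for
# illustration and do not come from any real airline system.
FLIGHT_XML = """
<flight>
  <number>XY123</number>
  <arrival>
    <time>14:35</time>
    <gate>B7</gate>
  </arrival>
</flight>
""".strip()


def describe_flight(xml_text):
    """Return the flight data as a dict, locating each value by its tag name."""
    root = ET.fromstring(xml_text)
    return {
        "number": root.findtext("number"),
        # Nested elements are reached with a simple path expression.
        "arrival_time": root.findtext("arrival/time"),
        "gate": root.findtext("arrival/gate"),
    }


flight = describe_flight(FLIGHT_XML)
print(flight)
```

Because the structure is carried by the tags themselves, a receiving program can pick out the gate or the arrival time without knowing anything about how a Web page would present them.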
Basic HTML is useful for generating static Web pages with fixed text and other objects, but most e-commerce applications require Web pages that provide interactive features for the user. For example, consider the case of an airline customer who wants to check the arrival time and gate information of a particular flight. The user may enter information such as a date and flight number in certain form fields of the Web page. The Web program must first submit a query to the airline database to retrieve this information, and then display it. Such Web pages, where part of the information is extracted from databases or other data sources, are called dynamic Web pages, because the data extracted and displayed each time will be for different flights and dates.
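The flight-status scenario can be sketched as follows. This is only an illustrative outline under stated assumptions: the table flight_status, its columns, the sample row, and the function names are all hypothetical, and an in-memory SQLite database stands in for the airline database.

```python
import sqlite3
from html import escape


def build_demo_db():
    """Create an in-memory database with one hypothetical flight record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flight_status ("
        "flight_no TEXT, flight_date TEXT, arrival_time TEXT, gate TEXT)"
    )
    conn.execute(
        "INSERT INTO flight_status VALUES (?, ?, ?, ?)",
        ("XY123", "2024-05-01", "14:35", "B7"),  # invented sample data
    )
    return conn


def render_status_page(conn, flight_no, flight_date):
    """Query the database with the user's form input and build the page."""
    row = conn.execute(
        "SELECT arrival_time, gate FROM flight_status "
        "WHERE flight_no = ? AND flight_date = ?",
        (flight_no, flight_date),  # parameters are bound, not concatenated
    ).fetchone()
    if row is None:
        body = "<p>No matching flight found.</p>"
    else:
        arrival_time, gate = row
        body = (
            f"<p>Flight {escape(flight_no)} on {escape(flight_date)} "
            f"arrives at {escape(arrival_time)}, gate {escape(gate)}.</p>"
        )
    return f"<html><body>{body}</body></html>"


conn = build_demo_db()
page = render_status_page(conn, "XY123", "2024-05-01")
print(page)
```

The page content is assembled at request time from the query result, so the same program produces a different page for each flight number and date entered.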
In this chapter, we will focus on describing the XML data model and its associated
languages, and how data extracted from relational databases can be formatted as
XML documents to be exchanged over the Web. Section 12.1 discusses the
differences among structured, semistructured, and unstructured data. Section
12.2 presents the XML data model, which is based on tree (hierarchical)
structures as compared to the flat relational data model structures. In Section
12.3, we focus on the structure of XML documents and the languages for
specifying the structure of these documents, such as DTD (Document Type
Definition) and XML Schema. Section 12.4 shows the relationship between XML and
relational databases. Section 12.5 describes some of the languages associated
with XML, such as XPath and XQuery. Section 12.6 discusses how data extracted
from relational databases can be formatted as XML documents. Finally, Section
12.7 is the chapter summary.