Print, Media, and Entertainment
The pervasiveness, applicability, and extensibility of XML have even reached the fairly innocuous arena of general entertainment. Playing games, watching movies, and other forms of entertainment are made more enjoyable and more intelligently enabled by XML technology.
The news industry is dominated by one thing: content. In fact, there really is no separation of news from content, and as such the issues around content management are the same as the issues around the creation and distribution of news. In the past, editorial environments would produce content to support various news products, which required the content to be tailored to each format. Where there is data, especially document and structured data such as what's present in the news industry, there is XML. In fact, there's a plethora of news- and content-related specifications squarely targeted at the needs of this space. In particular, the NewsML format, created initially by Reuters and supported by the International Press Telecommunications Council (IPTC), is a specification for the definition, creation, exchange, and packaging of news articles and related content. NewsML complements and extends another IPTC standards effort, the News Industry Text Format (NITF), which specifies the content of news articles. Once you have the kind of rich format that NewsML provides, you can build news products for different user groups without the extensive reengineering otherwise needed to mix different blends of news. Typical uses of NewsML include exchange in and among editorial systems, between news agencies and their customers, between publishers and news aggregators, and between news service providers and end users.
The main functionality of NewsML falls into the following areas: neutrality of news format and media type; easier development of news items; collection of news items into larger news "stories"; named relationships between news items; division of news stories into structures consisting of parts and named relationships between parts; alternative representations of those parts; explicit inclusion, inclusion by reference, and exclusion of parts and alternatives; and attachment of metadata from standard and nonstandard schemes. In addition, NewsML provides strong versioning support, support for multiple display methods, and adaptation to delivery environments.
As such, NewsML can be considered to be a “container” for news items. As the NewsML Web site states, “NewsML makes no assumption about the media type, format, or encoding of news. NewsML provides a structure within which news objects, of whatever type, relate to each other. NewsML can equally represent text, video, audio, graphics, and photos. NewsML takes the view that any medium can be the main part of a news item and that objects of all other types can fulfill secondary, tertiary, and other roles in respect of the main part. Hence, NewsML allows for the representation of simple textual stories, textual stories with primary and secondary photos, the evening TV news, with embedded individual reports, and so on.” An architecture diagram of the NewsML format is shown in Figure 22.7.
Because news stories develop over time, NewsML supports versioning and allows textual stories to be developed using takes. In addition, NewsML allows components that become available later to be attached to existing news story components. Another major feature of NewsML is the collection of news elements into a greater “story” whose components share the same “journalistic intent.” To support this capability, NewsML allows the construction of relationships between news items and collections of news items, such as “see also,” “related news,” and “for more detail,” so that these entities can exist in a web of such named relationships. The NewsML format also supports the authentication and signature of metadata and news item content because the value of news content, and its associated metadata, is highly dependent on its reliability.
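The versioning idea can be sketched in a few lines of Python. This is only an illustration, not part of the NewsML specification: the RevisionId element and its PreviousRevision attribute are taken from Listing 22.11, but the Revisions wrapper element, the sample values, and the revision_chain function are hypothetical, and the sketch assumes that PreviousRevision names the revision being replaced. The Update attribute is ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: RevisionId mirrors the sample listing, but the
# <Revisions> wrapper and the three revisions are invented for this sketch.
SAMPLE = """
<Revisions>
  <RevisionId PreviousRevision="0" Update="N">1</RevisionId>
  <RevisionId PreviousRevision="1" Update="N">2</RevisionId>
  <RevisionId PreviousRevision="2" Update="N">3</RevisionId>
</Revisions>
"""


def revision_chain(xml_text):
    """Return revision numbers ordered from oldest to newest."""
    root = ET.fromstring(xml_text)
    # Map each revision number to the revision it replaces.
    previous = {
        int(el.text): int(el.get("PreviousRevision"))
        for el in root.iter("RevisionId")
    }
    # The newest revision is the one that no other revision points back to.
    newest = (set(previous) - set(previous.values())).pop()
    chain = []
    current = newest
    # Walk backward until reaching a revision number that was never issued.
    while current in previous:
        chain.append(current)
        current = previous[current]
    return list(reversed(chain))


latest = revision_chain(SAMPLE)[-1]
```

A consumer that receives several revisions of the same item could use a chain like this to decide which one to display and which ones to archive.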
The architecture of a NewsML document consists of components and named relationships between components. Most news items contain a “main” part and some number of secondary and tertiary parts that complement the main part in various ways. This could take the form of a textual main part and photos as secondary parts. In addition, news items themselves can be related to other news items so that a news item can be a component of another, and individual components can be represented in different ways so that users can select the version they prefer or the one most appropriate to their delivery environment. For example, part of a news item might be available in HTML, RTF, and PDF versions, with photos available at different resolutions and color depths in the GIF or JPEG file format. This methodology also allows news items to be transmitted in print, on the Web, or over wireless delivery protocols because NewsML doesn’t describe layout semantics. Each part of a news item, and the news item as a whole, can contain metadata that describes physical properties of the parts; information about the construction of the parts, such as author, publisher, and owner; and information about the content, such as the topic, category, and importance. Although NewsML provides the facility to describe news items, it doesn’t specify any particular vocabulary for doing so and thus allows individual organizations to choose their metadata format.
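To make the idea of alternative representations concrete, the following Python sketch models a component that carries several renditions and lets a client pick the first one it can handle. The class names, field names, and file names here are hypothetical and do not correspond to NewsML element names; the sketch only illustrates the selection logic described above.

```python
from dataclasses import dataclass, field


@dataclass
class Rendition:
    media_type: str  # for example "text" or "photo"
    fmt: str         # for example "HTML", "RTF", "PDF", "JPEG"
    href: str        # hypothetical location of the content


@dataclass
class Component:
    role: str                                       # "main", "secondary", ...
    metadata: dict = field(default_factory=dict)    # vocabulary is up to the provider
    renditions: list = field(default_factory=list)  # equivalent alternatives

    def pick(self, accepted_formats):
        """Return the first rendition matching the client's preference order."""
        for fmt in accepted_formats:
            for rendition in self.renditions:
                if rendition.fmt == fmt:
                    return rendition
        return None  # nothing suitable for this delivery environment


# A textual main part offered in three equivalent formats.
main_part = Component(
    role="main",
    metadata={"author": "Staff writer", "topic": "economy"},
    renditions=[
        Rendition("text", "HTML", "story.html"),
        Rendition("text", "RTF", "story.rtf"),
        Rendition("text", "PDF", "story.pdf"),
    ],
)

print_choice = main_part.pick(["PDF", "HTML"])  # a print system prefers PDF
wireless_choice = main_part.pick(["WML"])       # no matching rendition exists
```

Because the component says nothing about layout, the same structure can serve a print system, a Web site, or a wireless client; only the preference list changes.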
NewsML can add as much information as needed for defining context that individuals can use to better locate and make use of news items. NewsML also gives users the opportunity to receive and aggregate news items from different vendors with similar metadata. Although the packaging features of NewsML are usable internally to produce what users might see on a Web page, it’s the metadata that allows users to link stories with their real meanings.
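As a rough illustration of aggregating items from different vendors by shared metadata, the Python sketch below groups headlines by a common topic value. The provider names, headlines, and the "topic" key are invented for this example, and the sketch assumes both providers already use the same topic vocabulary, which NewsML itself does not guarantee.

```python
from collections import defaultdict

# Hypothetical, already-parsed items from two providers that share a topic scheme.
ITEMS = [
    {"provider": "agency-a", "headline": "Election results announced", "topic": "politics"},
    {"provider": "agency-b", "headline": "Markets rally after vote", "topic": "economy"},
    {"provider": "agency-b", "headline": "Turnout reaches record high", "topic": "politics"},
]


def group_by_topic(items):
    """Collect headlines under their topic, regardless of which provider sent them."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["topic"]].append(item["headline"])
    return dict(grouped)


by_topic = group_by_topic(ITEMS)
```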
NewsML is a document format and not a messaging protocol, so it can be delivered using messaging or content-management schemes such as SOAP, RSS, and ICE. An example of a NewsML file can be found in Listing 22.11.
LISTING 22.11 Sample NewsML File
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE NewsML PUBLIC "urn:newsml:iptc.org:20001006:NewsMLv1.0:1" "./DTD/NewsMLv1.0.dtd">
<?xml-stylesheet type="text/xsl" href="./stylesheets/IPTCNewsML.xsl"?>
<Catalog Href="./catalog/mycatalog.xml"/>
<NewsEnvelope>
<RevisionId PreviousRevision="0" Update="N">1</RevisionId>
<PublicIdentifier>
<NewsItemType FormalName="News" Scheme="NewsItemType"/>
<FirstCreated>20001006</FirstCreated>
<ThisRevisionCreated>20001006</ThisRevisionCreated>
<Status FormalName="Usable" Scheme="IptcStatus"/>