Chapter: XML and Web Services : The Semantic Web : RDF for Information Owners

The RDF Family of Specifications

Now that you understand the basic idea behind RDF, let’s look at the family of RDF specifications in detail, noting how authoritative each one is and where it affects the interests of information owners. When we’re done, you’ll understand the maturity of the different parts of the RDF specification.


Core Specifications

RDF builds on two companion specifications. The model and syntax specification defines the triple (subject, predicate, object) in which RDF statements are made; the schema specification describes how to use RDF to build RDF vocabularies (collections of resources that can be used as predicates, the verbs in RDF statements).
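To make the triple concrete, here is a minimal sketch in Python. It is an illustration only, not part of any RDF specification: the `Triple` type, the field name `obj`, and the example.org URIs are all assumptions introduced for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement, modeled as subject, predicate, object."""
    subject: str    # the resource being described
    predicate: str  # a property taken from some vocabulary (the "verb")
    obj: str        # the value: another resource or a literal


# Hypothetical vocabulary term and resource URI, for illustration only.
CREATOR = "http://example.org/vocab#creator"

statement = Triple(
    subject="http://example.org/reports/2002-q1",
    predicate=CREATOR,
    obj="Jane",
)

print(statement.predicate)
```

A vocabulary, in this picture, is simply the agreed set of URIs (such as `CREATOR`) that may appear in the predicate position.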

Recent Working Drafts and Notes

RDF is a very dynamic set of specifications, in part because of W3C’s working draft/candidate recommendation/recommendation publication cycle, which encourages midcourse corrections based on implementation experience. Indeed, although institutionally W3C may be likened to a cathedral, in action (at least, in RDF) it may seem that the bazaar development model prevails, with all manner of goods on show or openly spilling amid a cacophony of raised voices: logicians, priests, and so on. In the sidebar, you will see the places to go to keep on top of RDF as it evolves.

Now, why are all these specifications of anything other than academic interest? In a word, interoperability.

RDF is not about my semantic site, my semantic department, or even my semantic enterprise. It is about the Semantic Web: It is a general solution for making statements that all machines (not just some machines) and all humans (not just some humans) can understand. Just as the strength of HTML is that it is simple and can be displayed anywhere, RDF’s strength is that it can be (or should be) understood everywhere. It is (or should be) a lingua franca.

If RDF statements are not interoperable (that is, if they are not understood in the same way by all processors), then it’s hard to see how the Semantic Web can come to be. Suppose that two processors have different understandings of a statement about a drug dosage? Or a statement in an aircraft repair manual? Or, if you are an information owner, your data? Mars Climate Orbiter was lost because one piece of software supplied a measurement in English units and another treated the same measurement as metric. On the Semantic Web, the impact of interoperability failure could come as lethal drug dosages, crashed airplanes, or corrupted data.

First, let’s take a look at the W3C note URIs, URLs, and URNs. What the note does not address is more critical than what it does. The issue is whether it is acceptable for a URI to identify a resource that cannot be retrieved over a network (for example, the person Jane). This issue is categorized as “unresolved.”

As a result, RDF’s use of URIs and the broader Web’s use of URIs may not be interoperable. It is therefore uncertain whether the following application areas are in scope for RDF, because they would depend on URIs for resources that cannot be retrieved over a network:

Government archives on physical media (for example, “reel/frame” numbers at the United States Patent and Trademark Office)

Legal citations to volume, reporter, and page

Warehouse applications

Help lines (where the resource is a human’s expertise)

Disembodied concepts
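The distinction between retrievable and non-retrievable resources can be sketched in Python as follows. This is a rough heuristic under stated assumptions, not a rule from any specification: the function name, the set of "retrievable" schemes, and the `urn:example:` identifiers are all hypothetical. Note also that the scheme alone cannot settle the question, since an `http` URI may itself be used to name a person or a concept.

```python
from urllib.parse import urlparse

# Assumption for this sketch: only these schemes suggest that a
# representation can be fetched over a network.
RETRIEVABLE_SCHEMES = {"http", "https", "ftp"}


def names_retrievable_resource(uri: str) -> bool:
    """Guess, from the scheme alone, whether a URI can be dereferenced."""
    return urlparse(uri).scheme.lower() in RETRIEVABLE_SCHEMES


examples = [
    "http://example.org/manuals/repair.html",  # a Web page
    "urn:example:uspto:reel-1234:frame-0567",  # hypothetical reel/frame identifier
    "urn:example:person:jane",                 # hypothetical name for the person Jane
]

for uri in examples:
    print(uri, names_retrievable_resource(uri))
```

An RDF processor that refuses the last two identifiers and one that accepts them would disagree about which statements are even well formed, which is the interoperability risk described above.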

Next, Refactoring RDF/XML Syntax raises further interoperability concerns. It summarizes the effects of reports from implementers. As it turns out, different RDF implementations, given the same markup, generate different graphs (instances of the RDF data model). Hence, the specification is ambiguous.
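One way to picture this kind of disagreement is to treat each graph as a set of triples and compare the sets. The Python sketch below assumes two hypothetical processors whose outputs are written out by hand; the URIs and the `graph_difference` helper are illustrative, and the comparison ignores complications such as anonymous (blank) nodes, which a real comparison would have to handle.

```python
def graph_difference(graph_a, graph_b):
    """Return the triples that appear in only one of two graphs."""
    a, b = set(graph_a), set(graph_b)
    return {"only_in_a": a - b, "only_in_b": b - a}


# Hypothetical resource and vocabulary URIs, for illustration only.
DOC = "http://example.org/doc"
TITLE = "http://example.org/vocab#title"
LANG = "http://example.org/vocab#language"

# Hand-written stand-ins for what two processors might emit
# when given the same markup.
processor_a = [(DOC, TITLE, "Repair Manual")]
processor_b = [(DOC, TITLE, "Repair Manual"), (DOC, LANG, "en")]

diff = graph_difference(processor_a, processor_b)
interoperable = not diff["only_in_a"] and not diff["only_in_b"]

print(diff)
print("same graph:", interoperable)
```

If the two sets differ for the same input, at least one processor has read the specification differently from the other.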

RDF Model Theory addresses interoperability concerns. It is an effort to enhance RDF’s precision by respecifying the RDF data model using techniques for defining the semantics of statements that are more precise than the text of the existing specifications.

RDF Test Cases provides a way of testing and possibly allaying interoperability concerns. It is a draft set of machine-processable test cases corresponding to technical issues addressed by the [RDF] WG, again based on W3C’s issues-tracking document, to which we now turn.

RDF Issue Tracking categorizes open issues in two ways:

Under consideration

Not yet under consideration

Here is a small selection of the issues that are under consideration as of this writing.

rdfs-xml-schema-datatypes. The RDF schema spec should consider using XML Schema data types in examples and/or in some formal specification of the mapping of these data types into the RDF model.

rdfms-literal-is-xml-structure. A literal containing XML markup should be treated as markup.

Here is a small selection of the issues that are on the list to be considered, but are not yet being considered:

rdfms-resource-semantics. What is a resource? How do resources relate to other concepts such as URI and entity?

rdfms-identity-of-statements. Does the RDF model allow more than one statement with the same triple of subject/predicate/object?

rdf-equivalent-representations. RDFMS employs several syntactic representations when describing the RDF abstract model. Are they truly equivalent?

Most of these issues raise interoperability concerns for owners of RDF information. For example, rdfms-formal-grammar, by making the description of the data model and its XML representation more rigorous, should have the effect of making implementations more consistent. Similarly for rdfs-xml-schema-datatypes: Why should RDF’s integers or dates not be interoperable with XML Schema’s? rdfms-resource-semantics raises the inconsistency, noted earlier, between URIs, URLs, and URNs and RFC 2396.
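To see why shared data types matter, consider a literal that carries an XML Schema datatype alongside its text. The Python sketch below is an assumption-laden illustration, not the mapping any RDF draft defines: the (lexical form, datatype URI) pairing, the `literal_value` helper, and the choice of just two data types are all made up for this example.

```python
from datetime import date

# Namespace of the XML Schema built-in data types.
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: only two data types are handled, and the date handling is
# simplified (it does not accept the optional time zone suffix).
CONVERTERS = {
    XSD + "integer": int,
    XSD + "date": date.fromisoformat,
}


def literal_value(lexical: str, datatype: str):
    """Convert a typed literal to a Python value, or fail loudly."""
    try:
        convert = CONVERTERS[datatype]
    except KeyError:
        raise ValueError(f"unsupported datatype: {datatype}") from None
    return convert(lexical)


print(literal_value("42", XSD + "integer"))
print(literal_value("2002-03-15", XSD + "date"))
```

Without an agreed datatype, one processor might treat "42" as a number and another as a string, and the two would disagree about whether it equals "042".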

Finally, those philosophical bugbears, identity and equivalence, are wakened from hibernation by rdfms-identity-of-statements and rdf-equivalent-representations. For the first issue, the question is whether “Subject has an object” and “Subject has an object” are two statements, when processed by an RDF engine, or one. For the second, you will see shortly that there are several ways to represent the RDF data model in syntax. Are these representations truly equivalent? How could we be certain? Here again, these questions raise interoperability issues, because different RDF implementations could make different assumptions on these points.
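The identity question can be shown in a few lines of Python. This is only a sketch of two possible design choices, with made-up example.org URIs; it does not claim that any particular RDF engine behaves either way.

```python
# One triple, asserted twice in the incoming data.
statement = (
    "http://example.org/subject",
    "http://example.org/vocab#has",
    "http://example.org/object",
)
incoming = [statement, statement]

graph_as_set = set(incoming)    # identical triples collapse into one
graph_as_list = list(incoming)  # every occurrence is kept

print(len(graph_as_set))   # 1
print(len(graph_as_list))  # 2
```

An implementation that stores statements in a set and one that stores them in a list will report different statement counts for the same input, which is exactly the kind of divergence the issue is meant to settle.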

Making the Case for RDF Investment

Finally, the $64,000 question: Assuming that avoiding reconversion of RDF data is a requirement, when should information owners and developers feel comfortable in making significant investments in RDF implementations?

This will depend on individual cases, of course. However, some general guidelines can be laid down. First, the interoperability issues of significance to the information owner and potential clients should be closed out at W3C. Monitor the RDF Issue Tracking site on this point. Second, there should be some W3C-recommended declarative specification for mechanically checking the validity of RDF instances, such as a W3C XML Schema or an XML DTD. Third, there should be test cases for checking RDF processors. Monitor the RDF Test Cases site on this point. This site has test cases only for technical issues, not a test suite for RDF processing in general. If W3C does not create such a suite, perhaps some institution such as the National Institute of Standards and Technology (NIST) will.

The fundamental RDF value proposition (conversations) remains unaffected by any concern raised by these glances at the innards of the W3C issues-tracking process. You now have the background to assess the RDF data model in detail.
