Schematron
Most of the schema definition languages we have explored to this point have
been based on grammatical structures. Now it’s time for a drastic change in
direction and concept. A schema definition language called Schematron takes a
different approach to schemas entirely: rather than basing them on some
grammatical structure, Schematron uses patterns to define schemas. By using
patterns, Schematron allows schema authors to express constraints that would
be difficult to represent in a more traditional grammar-based schema
definition language. Because its definition language is based on XPath and
XSLT, Schematron is also considerably easier to learn than other schema
definition languages for anyone who already knows those technologies. For
more information on Schematron, visit
http://www.ascc.net/xml/resource/schematron/.
The general idea behind
Schematron is to find a node set, typically elements, using XPath expressions
and check the node set against some other XPath expressions to see whether they
are true. A nice feature of the Schematron schema definition language is that
you can actually embed Schematron schemas inside the XML Schema Definition
Language’s <appinfo> element.
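As a minimal sketch of that embedding, the fragment below places a Schematron pattern inside an <appinfo> element of an XML Schema. The declaration of the Order element is simplified for illustration, and the sch prefix and pattern name are arbitrary choices rather than anything taken from the chapter's listings. An XML Schema validator ignores the contents of <appinfo>, so a separate tool is assumed to extract and run the Schematron rules.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="Order">
    <xs:annotation>
      <xs:appinfo>
        <!-- Schematron rule carried inside the XML Schema annotation -->
        <sch:pattern name="Order attributes">
          <sch:rule context="Order">
            <sch:assert test="@SubTotal">The <sch:name/> element must have
              a SubTotal attribute.</sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <!-- Simplified grammar-based declaration, for illustration only -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Product" maxOccurs="unbounded"/>
      </xs:sequence>
      <xs:attribute name="SubTotal"/>
      <xs:attribute name="ItemsSold"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```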
Currently in version 1.5, Schematron schemas
may be created using what are termed assertions, rules, patterns, and phases.
Assertions within a Schematron schema are simple declarative statements
contained within an <assert> or <report> element. The statement within an
<assert> element is one that is expected to be true for an XML document
conforming to the schema being defined, so its message is produced when the
test fails. The statement within a <report> element, however, is produced
when its test succeeds, which makes it suited to describing a condition that
is expected to be false for a conforming document. So, to create a message
that shows up when an element does not have, say, a particular child element,
you’d use an <assert> element that tests for the child, saying something like
“Element A must have an Element B.” Alternatively, you could use a <report>
element that tests for the absence of the child, with a statement saying
something like “Element B is missing from Element A.”
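The two phrasings can be sketched as a single rule. The element names A and B are the placeholders used in the text, not names from the purchase order document, and the fragment is meant to sit inside a <pattern> element.

```xml
<rule context="A">
  <!-- assert: the message is produced when the test is false (B is absent) -->
  <assert test="B">Element A must have an Element B.</assert>
  <!-- report: the message is produced when the test is true (B is absent) -->
  <report test="not(B)">Element B is missing from Element A.</report>
</rule>
```

Both elements fire under the same circumstances here, so in practice a schema would normally use only one of them for a given condition.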
Each of the elements used for assertions within a Schematron schema makes use
of an attribute called test. This attribute contains an XPath expression,
evaluated relative to the context node, that specifies the condition the
assertion must meet; node sets within the expression may be combined using
the union operator (|). In addition, each element may also contain the
following three elements:
<name>
<emph>
<span>
The <name> element, when appearing in the statement for an assertion, is used
to indicate that the name of the context node should be inserted at the
location where the <name> element is. This removes the need to know the exact
name of an element or elements for which an assertion will fail or hold true.
Also, you may optionally specify a path attribute that contains an XPath
expression to locate a specific node within the document, allowing a
different element or attribute to be used instead of the context node. The
<emph> element has been provided to allow for better formatting control so
that elements within the assertion statement can have the same formatting as
those within the <name> element. The <span> element performs exactly the same
function as the <span> element within HTML.
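A short sketch of these three elements in use follows. The element names come from the purchase order example later in this section; the class value on <span> is a hypothetical styling hook, and the fragment is meant to sit inside a <pattern> element.

```xml
<rule context="ShippingInformation | BillingInformation">
  <!-- <name/> inserts the name of the context node, so one message
       serves both elements -->
  <assert test="Address">The <name/> element must have an
    <emph>Address</emph> element.</assert>
  <!-- path selects a different node to name, here the parent of the
       context node -->
  <report test="not(Name)">The <name/> element inside
    <name path=".."/> has no <span class="element">Name</span>
    element.</report>
</rule>
```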
Within a Schematron schema, a
rule can be specified by using a <rule> element, which can contain both <assert> and <report> elements. The <rule> element itself has a context attribute that contains an
XPath expression used to identify when the assertions contained within the rule
should be tested. The combination of the <rule>, <assert> and <report> elements is the core behind the Schematron
schema definition language.
Rules are grouped together
using patterns, indicated by the <pattern> element. This <pattern> element is the nearest equivalent to a type.
Patterns may contain one or more <rule> elements and may also contain a variety of
attributes, including the following:
name
id
fpi
see
The name attribute allows you to specify text that can be easily read by
humans, whereas the id attribute assigns a unique ID to the <pattern>
element. The fpi attribute, which stands for Formal Public Identifier, allows
an SGML Formal Public Identifier to be attached to the <pattern> element. The
see attribute allows you to specify a URL that gives more documentation
regarding the tests.
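As an illustration, a pattern carrying all four attributes might look like the sketch below. The id value, the Formal Public Identifier, and the example.com URL are hypothetical placeholders, not values from the chapter.

```xml
<pattern name="Purchase order attributes"
         id="po-attributes"
         fpi="-//Example//TEXT Purchase Order Attributes//EN"
         see="http://www.example.com/docs/purchase-order.html">
  <rule context="PurchaseOrder">
    <assert test="@Tax">The <name/> element must have a
      <emph>Tax</emph> attribute.</assert>
  </rule>
</pattern>
```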
Now that you have a general
understanding of the elements that comprise a Schematron schema, let’s look at
an example. Listing 6.10 shows a sample Schematron schema for the sample XML
document in Listing 6.1.
LISTING 6.10 PurchaseOrder.xst Contains a Sample Schematron Schema for PurchaseOrder.xml
<!-- The xmlns value is the Schematron 1.5 namespace -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Sample">
    <rule context="PurchaseOrder">
      <assert test="@Tax">The <name/> element must have a <emph>Tax</emph> attribute.</assert>
      <assert test="@Total">The <name/> element must have a <emph>Total</emph> attribute.</assert>
    </rule>
    <rule context="ShippingInformation">
      <assert test="Name">The <name/> element must have a <emph>Name</emph> element.</assert>
      <assert test="Address">The <name/> element must have an <emph>Address</emph> element.</assert>
      <assert test="Method">The <name/> element must have a <emph>Method</emph> element.</assert>
      <assert test="DeliveryDate">The <name/> element must have a <emph>DeliveryDate</emph> element.</assert>
    </rule>
    <rule context="BillingInformation">
      <assert test="Name">The <name/> element must have a <emph>Name</emph> element.</assert>
      <assert test="Address">The <name/> element must have an <emph>Address</emph> element.</assert>
      <assert test="PaymentMethod">The <name/> element must have a <emph>PaymentMethod</emph> element.</assert>
      <assert test="BillingDate">The <name/> element must have a <emph>BillingDate</emph> element.</assert>
    </rule>
    <rule context="Address">
      <assert test="Street">The <name/> element must have a <emph>Street</emph> element.</assert>
      <assert test="City">The <name/> element must have a <emph>City</emph> element.</assert>
      <assert test="State">The <name/> element must have a <emph>State</emph> element.</assert>
      <assert test="Zip">The <name/> element must have a <emph>Zip</emph> element.</assert>
    </rule>
    <rule context="Order">
      <assert test="@SubTotal">The <name/> element must have a <emph>SubTotal</emph> attribute.</assert>
      <assert test="@ItemsSold">The <name/> element must have an <emph>ItemsSold</emph> attribute.</assert>
      <assert test="Product">The <name/> element must have a <emph>Product</emph> element.</assert>
    </rule>
    <rule context="Product">
      <assert test="@Name">The <name/> element must have a <emph>Name</emph> attribute.</assert>
      <assert test="@Id">The <name/> element must have an <emph>Id</emph> attribute.</assert>
      <assert test="@Price">The <name/> element must have a <emph>Price</emph> attribute.</assert>
      <assert test="@Quantity">The <name/> element must have a <emph>Quantity</emph> attribute.</assert>
    </rule>
  </pattern>
</schema>
As you can tell from the code in Listing 6.10, there is a dramatic difference
in complexity between it and the schema in Listing 6.2. Using the Schematron
definition language, we have been able to describe the rules against which an
XML document can be checked for conformance with a fraction of the complexity
of the formal XML Schema Definition Language. Plus, now that we can actually
see a schema created using the Schematron definition language, we can easily
see how effective the idea of basing the schema on patterns can be compared
with the very rigid and structured grammar-based method.
Alternatively, the schema in
Listing 6.10 could be written as shown in Listing 6.11 to create messages that
would indicate when an element or attribute is in compliance.
LISTING 6.11 PurchaseOrder2.xst Contains a Sample Schematron Schema for PurchaseOrder.xml
<!-- The xmlns value is the Schematron 1.5 namespace -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Sample">
    <rule context="PurchaseOrder">
      <report test="@Tax">The <name/> element has a <emph>Tax</emph> attribute.</report>
      <report test="@Total">The <name/> element has a <emph>Total</emph> attribute.</report>
    </rule>
    <rule context="ShippingInformation">
      <report test="Name">The <name/> element has a <emph>Name</emph> element.</report>
      <report test="Address">The <name/> element has an <emph>Address</emph> element.</report>
      <report test="Method">The <name/> element has a <emph>Method</emph> element.</report>
      <report test="DeliveryDate">The <name/> element has a <emph>DeliveryDate</emph> element.</report>
    </rule>
    <rule context="BillingInformation">
      <report test="Name">The <name/> element has a <emph>Name</emph> element.</report>
      <report test="Address">The <name/> element has an <emph>Address</emph> element.</report>
      <report test="PaymentMethod">The <name/> element has a <emph>PaymentMethod</emph> element.</report>
      <report test="BillingDate">The <name/> element has a <emph>BillingDate</emph> element.</report>
    </rule>
    <rule context="Address">
      <report test="Street">The <name/> element has a <emph>Street</emph> element.</report>
      <report test="City">The <name/> element has a <emph>City</emph> element.</report>
      <report test="State">The <name/> element has a <emph>State</emph> element.</report>
      <report test="Zip">The <name/> element has a <emph>Zip</emph> element.</report>
    </rule>
    <rule context="Order">
      <report test="@SubTotal">The <name/> element has a <emph>SubTotal</emph> attribute.</report>
      <report test="@ItemsSold">The <name/> element has an <emph>ItemsSold</emph> attribute.</report>
      <report test="Product">The <name/> element has a <emph>Product</emph> element.</report>
    </rule>
    <rule context="Product">
      <report test="@Name">The <name/> element has a <emph>Name</emph> attribute.</report>
      <report test="@Id">The <name/> element has an <emph>Id</emph> attribute.</report>
      <report test="@Price">The <name/> element has a <emph>Price</emph> attribute.</report>
      <report test="@Quantity">The <name/> element has a <emph>Quantity</emph> attribute.</report>
    </rule>
  </pattern>
</schema>
The two approaches can also be combined in a single schema, as shown in
Listing 6.12.
LISTING 6.12 A Sample Schematron Schema Combining <assert> and <report> Elements for PurchaseOrder.xml
<!-- The xmlns value is the Schematron 1.5 namespace -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <pattern name="Sample">
    <!-- The PurchaseOrder rule and the start of the ShippingInformation rule
         (up to the DeliveryDate report) are reconstructed from Listings 6.10 and 6.11 -->
    <rule context="PurchaseOrder">
      <assert test="@Tax">The <name/> element must have a <emph>Tax</emph> attribute.</assert>
      <assert test="@Total">The <name/> element must have a <emph>Total</emph> attribute.</assert>
      <report test="@Tax">The <name/> element has a <emph>Tax</emph> attribute.</report>
      <report test="@Total">The <name/> element has a <emph>Total</emph> attribute.</report>
    </rule>
    <rule context="ShippingInformation">
      <assert test="Name">The <name/> element must have a <emph>Name</emph> element.</assert>
      <assert test="Address">The <name/> element must have an <emph>Address</emph> element.</assert>
      <assert test="Method">The <name/> element must have a <emph>Method</emph> element.</assert>
      <assert test="DeliveryDate">The <name/> element must have a <emph>DeliveryDate</emph> element.</assert>
      <report test="Name">The <name/> element has a <emph>Name</emph> element.</report>
      <report test="Address">The <name/> element has an <emph>Address</emph> element.</report>
      <report test="Method">The <name/> element has a <emph>Method</emph> element.</report>
      <report test="DeliveryDate">The <name/> element has a <emph>DeliveryDate</emph> element.</report>
    </rule>
    <rule context="BillingInformation">
      <assert test="Name">The <name/> element must have a <emph>Name</emph> element.</assert>
      <assert test="Address">The <name/> element must have an <emph>Address</emph> element.</assert>
      <assert test="PaymentMethod">The <name/> element must have a <emph>PaymentMethod</emph> element.</assert>
      <assert test="BillingDate">The <name/> element must have a <emph>BillingDate</emph> element.</assert>
      <report test="Name">The <name/> element has a <emph>Name</emph> element.</report>
      <report test="Address">The <name/> element has an <emph>Address</emph> element.</report>
      <report test="PaymentMethod">The <name/> element has a <emph>PaymentMethod</emph> element.</report>
      <report test="BillingDate">The <name/> element has a <emph>BillingDate</emph> element.</report>
    </rule>
    <rule context="Address">
      <assert test="Street">The <name/> element must have a <emph>Street</emph> element.</assert>
      <assert test="City">The <name/> element must have a <emph>City</emph> element.</assert>
      <assert test="State">The <name/> element must have a <emph>State</emph> element.</assert>
      <assert test="Zip">The <name/> element must have a <emph>Zip</emph> element.</assert>
      <report test="Street">The <name/> element has a <emph>Street</emph> element.</report>
      <report test="City">The <name/> element has a <emph>City</emph> element.</report>
      <report test="State">The <name/> element has a <emph>State</emph> element.</report>
      <report test="Zip">The <name/> element has a <emph>Zip</emph> element.</report>
    </rule>
    <rule context="Order">
      <assert test="@SubTotal">The <name/> element must have a <emph>SubTotal</emph> attribute.</assert>
      <assert test="@ItemsSold">The <name/> element must have an <emph>ItemsSold</emph> attribute.</assert>
      <assert test="Product">The <name/> element must have a <emph>Product</emph> element.</assert>
      <report test="@SubTotal">The <name/> element has a <emph>SubTotal</emph> attribute.</report>
      <report test="@ItemsSold">The <name/> element has an <emph>ItemsSold</emph> attribute.</report>
      <report test="Product">The <name/> element has a <emph>Product</emph> element.</report>
    </rule>
    <rule context="Product">
      <assert test="@Name">The <name/> element must have a <emph>Name</emph> attribute.</assert>
      <assert test="@Id">The <name/> element must have an <emph>Id</emph> attribute.</assert>
      <assert test="@Price">The <name/> element must have a <emph>Price</emph> attribute.</assert>
      <assert test="@Quantity">The <name/> element must have a <emph>Quantity</emph> attribute.</assert>
      <report test="@Name">The <name/> element has a <emph>Name</emph> attribute.</report>
      <report test="@Id">The <name/> element has an <emph>Id</emph> attribute.</report>
      <report test="@Price">The <name/> element has a <emph>Price</emph> attribute.</report>
      <report test="@Quantity">The <name/> element has a <emph>Quantity</emph> attribute.</report>
    </rule>
  </pattern>
</schema>
In the schema shown in
Listing 6.12, an XML instance document would be evaluated and a very detailed
report of the level of its conformity could be generated by virtue of having
both <assert> and <report> elements within it. This
allows applications to test for either condition depending on what sort of
process is being attempted. But why would you ever want to use a pattern-based
schema versus a grammar-based one? Well, imagine our sample Purchase Order XML
document in Listing 6.1. Using Schematron, we could create a pattern that says
the sum of all the prices times the quantities of the <Product> elements within the <Order> element must equal the value
of the SubTotal attribute on the <Order> element.
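A rule along those lines might be sketched as follows. The sum() function in XPath 1.0 can only add up node values and cannot multiply Price by Quantity for each product first, so this sketch assumes a processor that supports ISO Schematron with the XPath 2.0 query binding rather than the Schematron 1.5 used in the listings above. It also assumes that Price, Quantity, and SubTotal hold plain numbers with no currency symbols; the pattern id is a hypothetical name.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <pattern id="order-subtotal">
    <rule context="Order">
      <!-- Multiply Price by Quantity for each Product, add the results,
           and compare with SubTotal, allowing for rounding to the cent -->
      <assert test="abs(sum(for $p in Product
                            return number($p/@Price) * number($p/@Quantity))
                        - number(@SubTotal)) lt 0.005">The SubTotal attribute
        of the <name/> element must equal the sum of Price times Quantity
        over its Product elements.</assert>
    </rule>
  </pattern>
</schema>
```

A grammar-based schema can require that the SubTotal attribute exists and is numeric, but it has no way to relate its value to the values on the <Product> elements, which is the kind of constraint a pattern-based schema is suited to.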