Ubiquitous computing is divided into two complementary modes of communication, simply because we have both eyes and ears. On the one hand, remote devices must display a graphical user interface that allows the user to communicate via the Internet. On the other hand, some devices must accept spoken commands and respond audibly—what might be called an audible user interface. In many ways, each mode of communication must be discussed separately. However, because of the requirement of presentation neutrality, there must be a core technology—based on XML—that's shared by the various communication modes.
Wireless Services: WAP and WML
The Wireless Application Protocol (WAP) is a specification for the delivery and presentation of information and telephony services via wireless networks on mobile phones, which are also called Web phones, as well as other wireless terminals. The current version of the WAP specification is 1.2.1, which is overseen by the WAP Forum, and is backed by major industry players, including Openwave, IBM, Sprint, Cingular, Ericsson, Motorola, and Nokia.
WAP 1.2.1 uses the Wireless Markup Language (WML), an XML derivative modeled after HTML, for tagging content for presentation on mobile handsets. The version of WML that corresponds to WAP 1.2.1 is WML1, which, like HTML, suffers from a lack of extensibility. Although WAP has been endorsed by over 90 percent of the world's handset manufacturers, its adoption has been spotty, due to several problems—inadequate security, low-quality applications, ineffective business models, poor usability, and inconsistent implementations across different handset models. The upcoming WAP 2.0 specification, on the other hand, addresses most of the issues with WAP 1.2.1 and WML1.
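To give a sense of WML1's card-and-deck model, the following is a minimal, hypothetical deck; the card names, titles, and stock data are invented for illustration, and the DOCTYPE shown is the WML 1.1 public identifier (the exact version a given microbrowser accepts may vary):

```xml
<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <!-- A deck holds one or more cards; the handset displays one card at a time -->
  <card id="welcome" title="Welcome">
    <p>Hello from a WML deck.</p>
    <!-- Bind the handset's "accept" softkey to navigate to the next card -->
    <do type="accept" label="Quote">
      <go href="#quote"/>
    </do>
  </card>
  <card id="quote" title="Quote">
    <p>ACME: 42.17</p>
  </card>
</wml>
```

Navigation between cards within a deck, as shown with the `<go href="#quote"/>` element, avoids a round trip to the server, which matters on slow wireless links.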
The WAP 2.0 specification includes the following features:
Direct support for TCP and HTTP. WAP 1.2.1 provided for the Wireless Session Protocol (WSP) and the Wireless Transaction Protocol (WTP), which had to be converted to TCP and HTTP at the WAP gateway. WAP 2.0 allows for HTTP and TCP at the handset and is also backward compatible with WAP 1.2.1.
The Wireless Application Environment (WAE). WAE provides for the interaction between WAP-based Web applications and wireless devices containing a WAP browser (called a microbrowser).
WAP 2.0 addresses the unique characteristics of wireless devices. These include small screens, limited battery life and memory, as well as user interface considerations such as one-finger navigation.
The updated WML2 markup language converges with the Extensible Hypertext Markup Language Basic (XHTML Basic). The Compact Hypertext Markup Language (cHTML), the format used for the i-Mode service widely used in Japan, also converges with XHTML Basic. XHTML Basic is a core subset of the Extensible Hypertext Markup Language (XHTML) that is appropriate for wireless devices.
WAP 2.0 supports two additional “mobile friendly” technologies. These include the Composite Capabilities/Preference Profiles (CC/PP) framework for describing user preferences and device capabilities as well as the Cascading Style Sheets (CSS) Mobile Profile, which provides a subset of CSS version 2 targeted at mobile devices. (CC/PP can be found at http://www.w3.org/Mobile/CCPP, and the CSS Mobile Profile is located at http://www.w3.org/TR/css-mobile).
Support for WAP Push. This allows server-based applications to send content and alerts to WAP devices without requiring the devices to poll the server.
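Because WML2 converges with XHTML Basic, a WAP 2.0 page can look much like ordinary XHTML. The following hypothetical document (the title and stock data are invented) uses the real XHTML Basic 1.0 DOCTYPE:

```xml
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
  "http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Stock Quotes</title>
  </head>
  <body>
    <!-- XHTML Basic keeps a core element set suitable for small screens -->
    <h1>Stock Quotes</h1>
    <p>ACME: 42.17</p>
  </body>
</html>
```

The same document could also carry a CSS Mobile Profile style sheet, letting authors share markup between desktop and handset clients.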
Voice Services: VoiceXML
The other key technology behind ubiquitous computing is VoiceXML. Many of the popular Web services have interfaces that require minimal input and provide concise, high-value text-based output. Voice applications strive to deliver services such as stock quotes, weather, driving reports, and so on over any telephone. Such applications accept both voice and Dual-Tone Multiple Frequency (DTMF, commonly called Touchtone) keypresses for input and synthesized speech or prerecorded audio playback for output. VoiceXML is an XML-based markup language specification that allows Web sites to deliver voice-based services over telephones to users. To access such a service, a user calls a number with his telephone and connects to a voice portal running on a VoiceXML server (refer to Figure 21.1). The voice portal, driven by VoiceXML, in turn interacts over the Internet with the application that delivers the service to the user.
There are several reasons why a company may want to make a Web application accessible via voice:
Voice access is very low cost. Users already have phones and phone service, so there is no initial cost to purchase a device or a special service for this mode of access.
Users are also familiar and comfortable with telephones, and telephones are globally available. One of the most vexing problems facing wireless data services today is availability. By using the telephone as the client device to access voice services, companies can maximize coverage and availability with minimal additional infrastructure cost.
Voice access enables eyes-free and hands-free operation. This makes voice access the only suitable choice in many situations, such as driving a car. Wireless PDAs, two-way pagers, and Web phones all require at least one hand and the user's eyes.
Companies that are already using Interactive Voice Response (IVR) systems have an incentive to move to voice portals driven by VoiceXML in order to consolidate all their services, thus reducing costs. Traditional IVR systems are closed, proprietary systems, making it difficult for companies to build presentation-neutral applications.
VoiceXML is a relatively straightforward XML-based language. VoiceXML’s features include the following:
Interaction dialogs, including <menu> and <form>, which provide for user input
Audio output, tagged with <prompt>, which provides either text-to-speech (TTS) or prerecorded audio streams
Audio input, including speech recognition and Touchtone capabilities
Presentation logic, including basic control flow commands as well as ECMAScript client-side scripting
Event handling, including bad input, help, and error conditions
Basic connection control, including call transfer, bridging, and disconnect
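Several of these features can be seen together in a short, hypothetical VoiceXML 1.0 document; the form name, field name, and the weather URL are placeholders rather than part of any real service:

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <!-- A menu offers spoken choices and routes the caller accordingly -->
  <menu id="main">
    <prompt>Say quotes or weather.</prompt>
    <choice next="#quotes">quotes</choice>
    <choice next="http://example.com/weather.vxml">weather</choice>
  </menu>

  <!-- A form collects input into fields, much like an HTML form -->
  <form id="quotes">
    <field name="symbol">
      <prompt>Which stock symbol would you like?</prompt>
      <!-- A grammar constraining valid utterances would go here -->
    </field>
    <block>
      <!-- Speak the collected value back via text-to-speech -->
      <prompt>Looking up <value expr="symbol"/>.</prompt>
    </block>
  </form>
</vxml>
```

The `<prompt>` content is rendered with TTS unless an `<audio>` element supplies a prerecorded stream, and the `<choice>` elements accept either the spoken word or, on most platforms, an associated DTMF key.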
Building a VoiceXML application, however, is far more complex than the VoiceXML language itself. Creating a VoiceXML application involves the following steps:
Designing the voice application and developing it with VoiceXML tools.
Tuning the endpoint parameters to improve comprehension and speech quality.
Tuning the grammars and parameters for the Automatic Speech Recognition (ASR) capability, essentially training the application to understand all relevant speech utterances.
Setting up the VoiceXML generator, interpreter, and platform to provide the required availability, scalability, and redundancy to the application.
Establishing a rigorous test suite and conducting thorough quality assurance.
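As a sketch of what grammar and event tuning works with, the following hypothetical `<field>` attaches an inline grammar and handlers for the help, no-input, and no-match events. The grammar uses Nuance's GSL syntax, one common but platform-specific format on early VoiceXML platforms, and the city names are invented:

```xml
<field name="city">
  <prompt>Which city would you like a report for?</prompt>
  <!-- Inline grammar: the square brackets list alternative utterances (GSL) -->
  <grammar type="application/x-gsl">
    [ boston chicago seattle ]
  </grammar>
  <help>Say the name of a city, such as Boston.</help>
  <noinput>Sorry, I did not hear you. <reprompt/></noinput>
  <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
</field>
```

Much of ASR tuning consists of widening such grammars to cover the utterances real callers actually produce while keeping them narrow enough that recognition stays accurate.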
As with wireless services, there’s more to voice services than meets the eye. ASR is the weak link in any voice service, and even with a well-trained ASR system, there are still many steps to creating a functional XML-based interactive voice service. Furthermore, with wireless services, the current WAP specification is sorely lacking in usability and functionality, and even when the WAP 2.0 specification becomes established, developers are still faced with issues of backward compatibility, multiple communication protocols, and a seemingly never-ending variety of handset configurations. So, now that you’ve been suitably warned, let’s proceed with the rest of the chapter.
The rest of the chapter addresses WML and VoiceXML by showing you how to use XSL style sheets to produce WML and VoiceXML from existing XML documents. The transformation examples will help you understand how to develop in WML and VoiceXML.
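As a preview of that approach, here is a sketch of an XSLT 1.0 style sheet that transforms a hypothetical source document such as `<quotes><quote symbol="ACME" price="42.17"/></quotes>` into a WML1 deck; the source vocabulary is invented for illustration:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Emit the WML 1.1 DOCTYPE so handsets recognize the output -->
  <xsl:output method="xml"
      doctype-public="-//WAPFORUM//DTD WML 1.1//EN"
      doctype-system="http://www.wapforum.org/DTD/wml_1.1.xml"/>

  <xsl:template match="/quotes">
    <wml>
      <card id="quotes" title="Quotes">
        <p>
          <!-- One line of output per quote element in the source -->
          <xsl:for-each select="quote">
            <xsl:value-of select="@symbol"/>
            <xsl:text>: </xsl:text>
            <xsl:value-of select="@price"/>
            <br/>
          </xsl:for-each>
        </p>
      </card>
    </wml>
  </xsl:template>
</xsl:stylesheet>
```

A second style sheet over the same source could just as easily emit a VoiceXML `<prompt>` for each quote, which is the essence of presentation neutrality.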