Object-Relational Features: Object Database Extensions to SQL
We introduced SQL as the standard language for RDBMSs in Chapters 4 and 5. As we discussed, SQL was first specified by Chamberlin and Boyce (1974) and under-went enhancements and standardization in 1989 and 1992. The language continued its evolution with a new standard, initially called SQL3 while being developed, and later known as SQL:99 for the parts of SQL3 that were approved into the standard. Starting with the version of SQL known as SQL3, features from object databases were incorporated into the SQL standard. At first, these extensions were known as SQL/Object, but later they were incorporated in the main part of SQL, known as SQL/Foundation. We will use that latest standard, SQL:2008, in our presentation of the object features of SQL, even though this may not yet have been realized in commercial DBMSs that follow SQL. We will also discuss how the object features of SQL evolved to their latest manifestation in SQL:2008.
The relational model with object database enhancements is sometimes referred to as the object-relational model. Additional revisions were made to SQL in 2003 and 2006 to add features related to XML (see Chapter 12).
The following are some of the object database features that have been included in SQL:
Some type constructors have been added to specify complex objects. These include the row type, which corresponds to the tuple (or struct) constructor. An array type for specifying collections is also provided. Other collection type constructors, such as set, list, and bag constructors, were not part of the original SQL/Object specifications but were later included in the standard.
A mechanism for specifying object identity through the use of reference type is included.
Encapsulation of operations is provided through the mechanism of user-defined types (UDTs) that may include operations as part of their declaration. These are somewhat similar to the concept of abstract data types that were developed in programming languages. In addition, the concept of user-defined routines (UDRs) allows the definition of general methods (operations).
Inheritance mechanisms are provided using the keyword UNDER.
We now discuss each of these concepts in more detail. In our discussion, we will refer to the example in Figure 11.4.
1. User-Defined Types and Complex Structures for Objects
To allow the creation of complex-structured objects, and to separate the declaration of a type from the creation of a table, SQL now provides user-defined types (UDTs). In addition, four collection types have been included to allow for multivalued types and attributes in order to specify complex-structured objects rather than just simple (flat) records. The user will create the UDTs for a particular application as part of the database schema. A UDT may be specified in its simplest form using the following syntax:
CREATE TYPE TYPE_NAME AS (<component declarations>);
Figure 11.4 illustrates some of the object concepts in SQL. We will explain the exam-ples in this figure gradually as we explain the concepts. First, a UDT can be used as either the type for an attribute or as the type for a table. By using a UDT as the type for an attribute within another UDT, a complex structure for objects (tuples) in a table can be created, much like that achieved by nesting type constructors. This is similar to using the struct type constructor of Section 11.1.3. For example, in Figure 11.4(a), the UDT STREET_ADDR_TYPE is used as the type for the STREET_ADDR attribute in the UDT USA_ADDR_TYPE. Similarly, the UDT USA_ADDR_TYPE is in turn used as the type for the ADDR attribute in the UDT PERSON_TYPE in Figure 11.4(b). If a UDT does not have any operations, as in the examples in Figure 11.4(a), it is possible to use the concept of ROW TYPE to directly create a structured attribute
To allow for collection types in order to create complex-structured
objects, four constructors are now included in SQL: ARRAY, MULTISET, LIST, and SET. These are similar to the type constructors discussed in Section
11.1.3. In the initial specifica-tion of SQL/Object, only the ARRAY type was specified, since it can be used to sim-ulate the other types,
but the three additional collection types were included in the latest version
of the SQL standard. In Figure 11.4(b), the PHONES
attribute of PERSON_TYPE
has as its type an array whose elements are of the
UDT USA_PHONE_TYPE. This
array has a maximum of four elements, meaning that we can store up to four
phone numbers per person. An array can also have no maximum number of elements
An array type can have its elements referenced using the common notation of square brackets. For example, PHONES refers to the first location value in a PHONES attribute (see Figure 11.4(b)). A built-in function CARDINALITY can return the cur-rent number of elements in an array (or any other collection type). For example, PHONES[CARDINALITY (PHONES)] refers to the last element in the array.
The commonly used dot notation is used to refer to components of a ROW TYPE or a UDT. For example, ADDR.CITY refers to the CITY component of an ADDR attribute (see Figure 11.4(b)).
2. Object Identifiers Using Reference Types
Unique system-generated object identifiers can be created via the reference type in the latest version of SQL. For example, in Figure 11.4(b), the phrase:
REF IS SYSTEM GENERATED
indicates that whenever a new PERSON_TYPE object is created, the system will assign it a unique system-generated identifier. It is also possible not to have a system-generated object identifier and use the traditional keys of the basic relational model if desired.
In general, the user can specify that system-generated object identifiers for the individual rows in a table should be created. By using the syntax:
REF IS <OID_ATTRIBUTE> <VALUE_GENERATION_METHOD> ;
the user declares that the attribute named <OID_ATTRIBUTE> will be used to identify individual tuples in the table. The options for <VALUE_GENERATION _METHOD> are SYSTEM GENERATED or DERIVED. In the former case, the system will automatically generate a unique identifier for each tuple. In the latter case, the traditional method of using the user-provided primary key value to identify tuples is applied.
3. Creating Tables Based on the UDTs
For each UDT that is specified to be instantiable via the phrase INSTANTIABLE (see Figure 11.4(b)), one or more tables may be created. This is illustrated in Figure 11.4(d), where we create a table PERSON based on the PERSON_TYPE UDT. Notice that the UDTs in Figure 11.4(a) are noninstantiable, and hence can only be used as types for attributes, but not as a basis for table creation. In Figure 11.4(b), the attrib-ute PERSON_ID will hold the system-generated object identifier whenever a new PERSON record (object) is created and inserted in the table.
4. Encapsulation of Operations
In SQL, a user-defined type can have its own behavioral specification by specifying methods (or operations) in addition to the attributes. The general form of a UDT specification with methods is as follows:
CREATE TYPE <TYPE-NAME> (
<LIST OF COMPONENT ATTRIBUTES AND THEIR TYPES>
<DECLARATION OF FUNCTIONS (METHODS)>
For example, in Figure 11.4(b), we declared a method Age() that calculates the age of an individual object of type PERSON_TYPE.
The code for implementing the method still has to be written. We can refer to the method implementation by specifying the file that contains the code for the method, or we can write the actual code within the type declaration itself (see Figure 11.4(b)).
SQL provides certain built-in functions for user-defined types. For a UDT called TYPE_T, the constructor function TYPE_T( ) returns a new object of that type. In the new UDT object, every attribute is initialized to its default value. An observer function A is implicitly created for each attribute A to read its value. Hence, A(X) or X.A returns the value of attribute A of TYPE_T if X is of type TYPE_T. A mutator function for updating an attribute sets the value of the attribute to a new value. SQL allows these functions to be blocked from public use; an EXECUTE privilege is needed to have access to these functions.
In general, a UDT can have a number of user-defined functions associated with it. The syntax is
INSTANCE METHOD <NAME> (<ARGUMENT_LIST>) RETURNS
Two types of functions can be defined: internal SQL and external. Internal functions are written in the extended PSM language of SQL (see Chapter 13). External functions are written in a host language, with only their signature (interface) appearing in the UDT definition. An external function definition can be declared as follows:
DECLARE EXTERNAL <FUNCTION_NAME> <SIGNATURE>
Attributes and functions in UDTs are divided into three categories:
PUBLIC (visible at the UDT interface)
PRIVATE (not visible at the UDT interface)
PROTECTED (visible only to subtypes)
It is also possible to define virtual attributes as part of UDTs, which are computed and updated using functions.
5. Specifying Inheritance and Overloading of Functions
Recall that we already discussed many of the principles of inheritance in Section 11.1.5. SQL has rules for dealing with type inheritance (specified via the UNDER keyword). In general, both attributes and instance methods (operations) are inherited. The phrase NOT FINAL must be included in a UDT if subtypes are allowed to be created under that UDT (see Figure 11.4(a) and (b), where PERSON_TYPE,
STUDENT_TYPE, and EMPLOYEE_TYPE are declared to be NOT FINAL). Associated with type inheritance are the rules for overloading of function implementations and for resolution of function names. These inheritance rules can be summarized as follows:
All attributes are inherited.
The order of supertypes in the UNDER clause determines the inheritance hierarchy.
An instance of a subtype can be used in every context in which a supertype instance is used.
A subtype can redefine any function that is defined in its supertype, with the restriction that the signature be the same.
When a function is called, the best match is selected based on the types of all arguments.
For dynamic linking, the runtime types of parameters is considered.
Consider the following examples to illustrate type inheritance, which are illustrated in Figure 11.4(c). Suppose that we want to create two subtypes of PERSON_TYPE: EMPLOYEE_TYPE and STUDENT_TYPE. In addition, we also create a subtype MANAGER_TYPE that inherits all the attributes (and methods) of EMPLOYEE_TYPE but has an additional attribute DEPT_MANAGED. These subtypes are shown in Figure 11.4(c).
In general, we specify the local attributes and any additional specific methods for the subtype, which inherits the attributes and operations of its supertype.
Another facility in SQL is table inheritance via the supertable/subtable facility. This is also specified using the keyword UNDER (see Figure 11.4(d)). Here, a new record that is inserted into a subtable, say the MANAGER table, is also inserted into its supertables EMPLOYEE and PERSON. Notice that when a record is inserted in MANAGER, we must provide values for all its inherited attributes. INSERT, DELETE, and UPDATE operations are appropriately propagated.
6. Specifying Relationships via Reference
A component attribute of one tuple may be a reference (specified using the key-word REF) to a tuple of another (or possibly the same) table. An example is shown in Figure 11.4(e).
The keyword SCOPE specifies the name of the table whose tuples can be referenced by the reference attribute. Notice that this is similar to a foreign key, except that the system-generated value is used rather than the primary key value.
SQL uses a dot notation to build path expressions that refer to the component attributes of tuples and row types. However, for an attribute whose type is REF, the dereferencing symbol –> is used. For example, the query below retrieves employees working in the company named ‘ABCXYZ’ by querying the EMPLOYMENT table:
FROM EMPLOYMENT AS E
WHERE E.Company–>COMP_NAME = ‘ABCXYZ’;
In SQL, –> is used for dereferencing and has the same meaning assigned to it in the C programming language. Thus, if r is a reference to a tuple and a is a component attribute in that tuple, then r –> a is the value of attribute a in that tuple.
several relations of the same type exist, SQL provides the SCOPE keyword by which a
reference attribute may be made to point to a tuple within a specific table of