The Object Query Language OQL
The object query language OQL is the query language proposed for the
ODMG object model. It is designed to work closely with the programming
languages for which an ODMG binding is defined, such as C++, Smalltalk, and
Java. Hence, an OQL query embedded into one of these programming languages can
return objects that match the type system of that language. Additionally, the
implementations of class operations in an ODMG schema can have their code
written in these programming languages. The OQL syntax for queries is similar
to the syntax of the relational standard query language SQL, with additional
features for ODMG concepts, such as object identity, complex objects,
operations, inheritance, polymorphism, and relationships.
In Section 11.5.1 we will discuss the syntax of simple OQL queries and
the concept of using named objects or extents as database entry points. Then,
in Section 11.5.2 we will discuss the structure of query results and the use of
path expressions to traverse relationships among objects. Other OQL features
for handling object identity, inheritance, polymorphism, and other
object-oriented concepts are discussed in Section 11.5.3. The examples to
illustrate OQL queries are based on the UNIVERSITY database
schema given in Figure 11.10.
1. Simple OQL Queries, Database Entry Points, and Iterator Variables
The basic OQL syntax is a select ... from ... where ... structure, as it is for SQL. For example, the query to retrieve the
names of all departments in the college of ‘Engineering’ can be written as
follows:
Q0: select D.Dname
from D in DEPARTMENTS
where D.College = ‘Engineering’;
In general, an entry point to
the database is needed for each query, which can be any named persistent object. For many queries, the entry point is the
name of the extent of a class. Recall that the extent name is considered to be the name of
a persistent object whose type is a collection (in most cases, a set) of objects from the class. Looking at the extent names in Figure
11.10, the named object DEPARTMENTS is of type set<DEPARTMENT>; PERSONS is of type set<PERSON>; FACULTY is of type set<FACULTY>; and so on.
The use of an extent name—DEPARTMENTS in Q0—as an entry point refers to a persistent collection of objects.
Whenever a collection is referenced in an OQL query, we should define an iterator variable—D in Q0—that ranges over each object in the collection. In many cases, as in Q0, the query will select certain objects from the collection, based on
the conditions specified in the where clause.
In Q0, only persistent objects D in
the collection of DEPARTMENTS that satisfy the condition D.College = ‘Engineering’ are selected for the query result. For each selected
object D, the value of D.Dname is retrieved
in the query result. Hence, the type of
the result for Q0 is bag<string> because the type of each Dname value is string (even though the actual result
is a set because Dname is a key attribute). In general, the result of a query would be of type
bag for select
... from ... and of type set for select distinct
... from ... , as in SQL (adding the keyword distinct eliminates duplicates).
Using the example in Q0, there are three syntactic
options for specifying iterator variables:
D in DEPARTMENTS
DEPARTMENTS D
DEPARTMENTS AS D
We will use the first construct in our examples.
The named objects used as database entry points for OQL queries are not
limited to the names of extents. Any named persistent object, whether it refers
to an atomic (single) object or to a collection object, can be used as a
database entry point.
2. Query Results and
Path Expressions
In general, the result of a query can be of any type that can be
expressed in the ODMG object model. A query does not have to follow the select ... from ... where
...
structure; in the simplest case, any persistent name on its own is a
query, whose result is a reference to that persistent object. For example, the
query
Q1: DEPARTMENTS;
returns a reference to the collection of all persistent DEPARTMENT objects, whose type is set<DEPARTMENT>. Similarly, suppose we had given (via the database bind operation,
see Figure 11.8) a persistent name CS_DEPARTMENT to a
single DEPARTMENT
object (the Computer Science department); then, the
query
Q1A: CS_DEPARTMENT;
returns a reference to that individual object of type DEPARTMENT. Once an entry point is specified, the concept of a path expression can be used to specify
a path to related attributes and
objects. A path expression typically starts at a persistent object name, or
at the iterator variable that ranges over individual objects in a collection. This name will be followed by zero or
more relationship names or attribute names connected using the dot notation. For example, referring to
the UNIVERSITY data-base in Figure 11.10, the following are examples of path
expressions, which are also valid queries in OQL:
Q2: CS_DEPARTMENT.Chair;
Q2A: CS_DEPARTMENT.Chair.Rank;
The first expression Q2 returns an object of type FACULTY, because that is the type of the attribute Chair of the DEPARTMENT class. This will be a reference to the FACULTY object
that is related to the DEPARTMENT object whose persistent name is CS_DEPARTMENT via the attribute Chair; that is, a reference to the FACULTY object
who is chairperson of the Computer Science
department. The second expression Q2A is
similar, except that it returns the Rank of this FACULTY object (the Computer Science chair) rather than the
object reference; hence, the type returned by Q2A is string, which is the data type for the Rank attribute
of the
FACULTY class.
Path expressions Q2 and Q2A return single values, because the attributes Chair (of DEPARTMENT) and
Rank (of FACULTY) are both single-valued and they
are applied
to a single object. The third expression, Q2B, is different; it returns an object of type set<FACULTY> even when applied to a single object, because that is the type of
the relationship Has_faculty of the DEPARTMENT class. The collection returned will include references to all FACULTY objects that are related to the DEPARTMENT object
whose persistent name is CS_DEPARTMENT via the relationship Has_faculty; that is, references to all FACULTY objects
who are working in the Computer Science department. Now, to return the ranks
of Computer Science faculty, we cannot
write
Q3 : CS_DEPARTMENT.Has_faculty.Rank;
because it is not clear whether the object returned would be of type set<string> or bag<string> (the latter being more likely, since multiple faculty may share the
same rank). Because of this type of ambiguity problem, OQL does not allow
expressions such as Q3 . Rather, one must use an
iterator variable over any collections, as in Q3A or Q3B below:
Q3A: select F.Rank
from F in CS_DEPARTMENT.Has_faculty;
Q3B: select distinct F.Rank
from F in CS_DEPARTMENT.Has_faculty;
Here, Q3A returns bag<string> (duplicate rank values appear in the result), whereas Q3B returns
set<string>
(duplicates are eliminated via the distinct keyword).
Both Q3A and
Q3B illustrate how an iterator variable can be defined
in the
from clause to range
over a restricted collection specified in the query. The variable F in Q3A and Q3B ranges over the elements of the collection CS_DEPARTMENT.Has_faculty, which
is of type set<FACULTY>, and includes only those faculty who are members of the Computer
Science department.
In general, an OQL query can return a result with a complex structure
specified in the query itself by utilizing the struct keyword.
Consider the following examples:
Q4: CS_DEPARTMENT.Chair.Advises;
Q4A: select struct (
name: struct (last_name: S.name.Lname, first_name: S.name.Fname),
degrees:(
select struct (deg: D.Degree,
yr: D.Year,
college: D.College)
from D in S.Degrees ))
from S in
CS_DEPARTMENT.Chair.Advises;
Here, Q4 is straightforward, returning an object of type set<GRAD_STUDENT> as its result; this is the collection of graduate students who are
advised by the chair of the Computer Science department. Now, suppose that a
query is needed to retrieve the last and first names of these graduate
students, plus the list of previous degrees of each. This can be written as in Q4A, where the variable S ranges
over the collec-tion of graduate students advised by the chairperson, and the
variable D ranges over the degrees of
each such student S. The type of the
result of Q4A is a collection of (first-level) structs where
each struct has two components: name and degrees.
The name component is a further struct made up of last_name and first_name, each being a single string. The degrees component is defined by an
embedded query and is itself a collection of further (second level) structs,
each with three string compo-nents: deg, yr, and college.
Note that OQL is orthogonal
with respect to specifying path expressions. That is, attributes, relationships,
and operation names (methods) can be used interchange-ably within the path
expressions, as long as the type system of OQL is not compro-mised. For
example, one can write the following queries to retrieve the grade point
average of all senior students majoring in Computer Science, with the result
ordered by GPA, and within that by last and first name:
Q5A: select struct ( last_name:
S.name.Lname, first_name: S.name.Fname,
gpa: S.gpa )
from S in CS_DEPARTMENT.Has_majors
where S.Class = ‘senior’
order by gpa desc, last_name asc, first_name asc;
Q5B: select struct ( last_name:
S.name.Lname, first_name: S.name.Fname,
gpa: S.gpa )
from S in STUDENTS
where S.Majors_in.Dname = ‘Computer Science’
and
S.Class =
‘senior’
order by gpa desc, last_name asc, first_name asc;
Q5A used the named entry point CS_DEPARTMENT to
directly locate the reference to the Computer Science
department and then locate the students via the relation-ship Has_majors, whereas Q5B searches the STUDENTS extent to locate all students
majoring in that department. Notice how attribute names, relationship names,
and operation (method) names are all used interchangeably (in an orthogonal
manner) in the path expressions: gpa is an
operation; Majors_in and Has_majors are relation-ships; and Class, Name, Dname, Lname, and Fname are attributes. The implementa-tion of the gpa operation computes the grade point average and returns its value as a
float type for each selected STUDENT.
The order
by clause is similar to the corresponding SQL
construct, and specifies in which order the query result is to be displayed.
Hence, the collection returned by a query with an order by clause is of type list.
3. Other Features of
OQL
Specifying Views as Named Queries. The view
mechanism in OQL uses the concept
of a named query. The define keyword is used to specify an
identifier of the named query, which must be a unique name among all named
objects, class names, method names, and function names in the schema. If the
identifier has the same name as an existing named query, then the new
definition replaces the previous def-inition. Once defined, a query definition
is persistent until it is redefined or deleted. A view can also have parameters
(arguments) in its definition.
For
example, the following view V1 defines
a named query Has_minors to
retrieve the set of objects for students minoring in a given department:
V1: define Has_minors(Dept_name) as
select S
from S in STUDENTS
where S.Minors_in.Dname = Dept_name;
Because
the ODL schema in Figure 11.10 only provided a unidirectional Minors_in attribute for a STUDENT, we can use the above view to
represent its inverse without having to explicitly define a relationship. This
type of view can be used to represent inverse relationships that are not expected
to be used frequently. The user can now utilize the above view to write queries
such as
Has_minors(‘Computer Science’);
which
would return a bag of students minoring in the Computer Science depart-ment.
Note that in Figure 11.10, we defined Has_majors as an
explicit relationship, presumably because it is expected to be used more often.
Extracting Single Elements from Singleton
Collections. An OQL query will, in
general, return a collection as its
result, such as a bag, set (if distinct is specified), or list (if the order by clause is used). If the user
requires that a query only return a sin-gle element, there is an element operator in OQL that is
guaranteed to return a sin-gle element E
from a singleton collection C that
contains only one element. If C
contains more than one element or if C
is empty, then the element operator raises
an exception. For example, Q6 returns the single object
reference to the Computer Science
department:
Q6: element
( select D
from D in DEPARTMENTS
where D.Dname = ‘Computer Science’ );
Since a
department name is unique across all departments, the result should be one
department. The type of the result is D:DEPARTMENT.
Collection Operators (Aggregate Functions,
Quantifiers). Because many query
expressions specify collections as their result, a number of operators have
been defined that are applied to such collections. These include aggregate
operators as well as membership and quantification (universal and existential)
over a collection.
The
aggregate operators (min, max, count, sum, avg) operate over a collection. The
operator count returns
an integer type. The remaining aggregate operators (min, max, sum, avg) return the same type as the
type of the operand collection. Two examples
follow.
The query Q7 returns
the number of students minoring in Computer Science and Q8 returns the average GPA of all
seniors majoring in Computer Science.
Q7: count ( S in Has_minors(‘Computer Science’));
Q8: avg ( select S.Gpa
from S in STUDENTS
where S.Majors_in.Dname = ‘Computer Science’ and S.Class = ‘Senior’);
Notice
that aggregate operations can be applied to any collection of the appropriate
type and can be used in any part of a query. For example, the query to retrieve
all department names that have more than 100 majors can be written as in Q9:
Q9: select D.Dname
from D in DEPARTMENTS
where count (D.Has_majors) >
100;
The membership and quantification expressions return a Boolean type—that is, true or
false. Let V be a variable, C a collection expression, B an expression of type Boolean (that
is, a Boolean condition), and E an
element of the type of elements in collection C. Then:
(E in C) returns true if element E is a member of collection C.
(for all V in C : B)
returns true if all the elements of
collection C satisfy B. (exists V in C : B)
returns true if there is at least one element in C satisfying B.
To
illustrate the membership condition, suppose we want to retrieve the names of
all students who completed the course called ‘Database Systems I’. This can be
writ-ten as in Q10, where
the nested query returns the collection of course names that each STUDENT S has completed, and the membership condition returns true if
‘Database Systems I’ is in the collection for a particular STUDENT S:
Q10: select S.name.Lname, S.name.Fname from S in STUDENTS
where ‘Database
Systems I’ in
select C.Section.Of_course.Cname
from C in S.Completed_sections);
Q10 also illustrates a simpler way to
specify the select clause of
queries that return a collection
of structs; the type returned by Q10 is bag<struct(string, string)>.
One can
also write queries that return true/false results. As an example, let us assume
that there is a named object called JEREMY of type STUDENT. Then, query Q11 answers the following question: Is Jeremy
a Computer Science minor? Similarly,
Q12 answers the question Are all
Computer Science graduate students advised by Computer Science faculty? Both Q11 and Q12 return true or false, which are
inter-preted as yes or no answers to the above questions:
Q11: JEREMY in Has_minors(‘Computer
Science’);
Q12: for all G
in
select S
fromS in GRAD_STUDENTS
where S.Majors_in.Dname
=
‘Computer Science’ )
: G.Advisor
in CS_DEPARTMENT.Has_faculty;
Note that
query Q12 also illustrates how attribute,
relationship, and operation inheritance applies to queries. Although S is an iterator that ranges over the extent GRAD_STUDENTS, we can write
S.Majors_in because the Majors_in relationship is inherited by GRAD_STUDENT from STUDENT via extends (see Figure 11.10). Finally, to
illustrate the exists
quantifier, query Q13 answers
the following question: Does any graduate Computer Science major have a
4.0 GPA? Here, again, the operation gpa is
inherited by GRAD_STUDENT from STUDENT via extends.
Q13: exists G in
select S
from S
in GRAD_STUDENTS
where S.Majors_in.Dname
=
‘Computer Science’ )
: G.Gpa = 4;
Ordered (Indexed) Collection Expressions. As we
discussed in Section 11.3.3, collections
that are lists and arrays have additional operations, such as retrieving the ith, first, and last elements.
Additionally, operations exist for extracting a subcollec-tion and
concatenating two lists. Hence, query expressions that involve lists or arrays
can invoke these operations. We will illustrate a few of these operations using
sample queries. Q14
retrieves the last name of the faculty member who earns the highest salary:
Q14:
first ( select struct(facname: F.name.Lname, salary: F.Salary)
from F in FACULTY
order by salary desc );
Q14 illustrates the use of the first operator on a list collection
that contains the salaries
of faculty members sorted in descending order by salary. Thus, the first
ele-ment in this sorted list contains the faculty member with the highest
salary. This query assumes that only one faculty member earns the maximum salary.
The next query, Q15,
retrieves the top three Computer Science majors based on GPA.
Q15: ( select struct(
last_name: S.name.Lname, first_name: S.name.Fname,
gpa: S.Gpa )
from S in CS_DEPARTMENT.Has_majors
order by gpa desc ) [0:2];
The select-from-order-by query
returns a list of Computer Science students ordered by GPA in descending order.
The first element of an ordered collection has an index position of 0, so the
expression [0:2] returns
a list containing the first, second, and third elements of the select ... from ... order by ... result.
The Grouping Operator. The group by clause in OQL, although similar
to the corresponding clause in SQL,
provides explicit reference to the collection of objects within each group or partition. First we give an example, and then we describe the
general form of these queries.
Q16 retrieves the number of majors in
each department. In this query, the students
are
grouped into the same partition (group) if they have the same major; that is,
the
same
value for S.Majors_in.Dname:
Q16: ( select struct( dept_name,
number_of_majors: count (partition) )
from S in STUDENTS
group by dept_name: S.Majors_in.Dname;
The
result of the grouping specification is of type set<struct(dept_name: string, parti-tion: bag<struct(S:STUDENT>)>),
which contains a struct for each group (partition) that has two components: the
grouping attribute value (dept_name) and the
bag of the STUDENT objects
in the group (partition). The select clause returns the grouping
attribute (name of the department), and a count of the number of elements in
each partition (that is, the number of students in each department), where partition is the keyword used to refer to
each partition. The result type of the select clause
is set<struct(dept_name: string, number_of_majors:
integer)>. In
general, the syntax for the group by clause is
group by F1: E1, F2: E2, ..., Fk: Ek
where F1: E1, F2: E2, ..., Fk: Ek is a list of partitioning (grouping) attributes and
each partitioning attribute specification Fi: Ei defines an attribute (field) name Fi and an expression Ei. The result of applying
the grouping (specified in the group by clause) is a set of structures:
set<struct(F1: T1, F2: T2, ..., Fk: Tk, partition: bag<B>)>
where Ti is the type returned by
the expression Ei, partition is a distinguished field name (a
keyword), and B is a structure whose
fields are the iterator variables (S
in Q16) declared in the from clause having the appropriate
type.
Just as
in SQL, a having clause
can be used to filter the partitioned sets (that is, select only some of the
groups based on group conditions). In Q17, the
previous query is modified to illustrate the having clause (and also shows the simplified syntax for the select clause). Q17 retrieves for each department
having more than 100 majors, the
average
GPA of its majors. The having clause
in Q17 selects only those partitions
(groups) that have more than 100 elements (that is, departments with more than
100 students).
Q17: select dept_name, avg_gpa: avg ( select P.gpa
from P in partition) from S in STUDENTS
group by dept_name: S.Majors_in.Dname
having count (partition) > 100;
Note that
the select clause of Q17 returns the
average GPA of the students in the partition. The expression
select P.Gpa from P in partition
returns a
bag of student GPAs for that partition. The from clause
declares an iterator variable P over
the partition collection, which is of type bag<struct(S: STUDENT)>.
Then the path expression P.gpa is used to access the GPA of
each student in the partition.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2023 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.