Home | | Database Management Systems | Translating SQL Queries into Relational Algebra

# Translating SQL Queries into Relational Algebra

In practice, SQL is the query language that is used in most commercial RDBMSs.

Translating SQL Queries into Relational Algebra

In practice, SQL is the query language that is used in most commercial RDBMSs. An SQL query is first translated into an equivalent extended relational algebra expression—represented as a query tree data structure—that is then optimized. Typically, SQL queries are decomposed into query blocks, which form the basic units that can be translated into the algebraic operators and optimized. A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses if these are part of the block. Hence, nested queries within a query are identified as separate query blocks. Because SQL includes aggregate operators—such as MAX, MIN, SUM, and COUNT—these operators must also be included in the extended algebra, as we discussed in Section 6.4.

Consider the following SQL query on the EMPLOYEE relation in Figure 3.5:

SELECT  Lname, Fname

FROM                             EMPLOYEE

WHERE                           Salary > ( SELECT    MAX (Salary)

FROM                                                    EMPLOYEE

WHERE                                                  Dno=5 );

This query retrieves the names of employees (from any department in the company) who earn a salary that is greater than the highest salary in department 5. The query includes a nested subquery and hence would be decomposed into two blocks. The inner block is:

SELECT  MAX (Salary)

FROMEMPLOYEE

WHERE  Dno=5 )

This retrieves the highest salary in department 5. The outer query block is:

SELECT                           Lname, Fname

FROM                             EMPLOYEE

WHERE                           Salary > c

where c represents the result returned from the inner block. The inner block could be translated into the following extended relational algebra expression:

MAX Salary(σDno=5(EMPLOYEE))

and the outer block into the expression:

πLname,Fname(σSalary>c(EMPLOYEE))

The query optimizer would then choose an execution plan for each query block. Notice that in the above example, the inner block needs to be evaluated only once to produce the maximum salary of employees in department 5, which is then used—as the constant c—by the outer block. We called this a nested query (without correlation with the outer query) in Section 5.1.2. It is much harder to optimize the more complex correlated nested queries (see Section 5.1.3), where a tuple variable from the outer query block appears in the WHERE-clause of the inner query block.

Study Material, Lecturing Notes, Assignment, Reference, Wiki description explanation, brief detail
Fundamentals of Database Systems : Query Processing and Optimization, and Database Tuning : Algorithms for Query Processing and Optimization : Translating SQL Queries into Relational Algebra |