Algorithms for Relational Database Schema Design
We now give three algorithms for creating a relational decomposition
from a uni-versal relation. Each algorithm has specific properties, as we
discuss next.
1. Dependency-Preserving
Decomposition into 3NF Schemas
Algorithm
16.4 creates a dependency-preserving decomposition D = {R1, R2, ..., Rm} of a universal relation R based on a set of functional dependencies F, such that each Ri in D is in 3NF. It guarantees only the dependency-preserving property;
it does not guarantee the nonadditive
join property. The first step of Algorithm 16.4 is to find a minimal cover G for F; Algorithm 16.2 can be used for this step. Note that multiple
minimal covers may exist for a given set F
(as we illustrate later in the example after Algorithm 16.4). In such cases the
algorithms can potentially yield multiple alternative designs.
Algorithm 16.4. Relational Synthesis into 3NF
with Dependency Preservation
Input: A universal relation R and a set of functional dependencies F on the attributes of R.
Find a minimal cover G for F (use Algorithm
16.2);
For each left-hand-side X of a functional dependency that appears in G, cre-ate a relation schema in D
with attributes {X ∪ {A1}
∪ {A2}
... ∪ {Ak}
}, where X → A1, X → A2, ..., X → Ak are the only dependencies in G with X as the
left-hand-side (X is the key of this
relation);
Place any remaining attributes (that have not been
placed in any relation) in a single relation schema to ensure the attribute
preservation property.
Example of Algorithm 16.4. Consider
the following universal relation:
U(Emp_ssn, Pno, Esal, Ephone, Dno, Pname, Plocation)
Emp_ssn, Esal, Ephone refer to the Social Security
number, salary, and phone number
of the
employee. Pno, Pname, and Plocation refer to the number, name, and
location of the project. Dno is
department number.
The
following dependencies are present:
FD1: Emp_ssn → {Esal, Ephone, Dno}
FD2: Pno → { Pname, Plocation}
FD3: Emp_ssn, Pno → {Esal, Ephone, Dno, Pname, Plocation}
By virtue
of FD3, the attribute set {Emp_ssn, Pno} represents a key of the
universal relation. Hence F, the set
of given FDs includes {Emp_ssn → Esal, Ephone, Dno;
Pno → Pname, Plocation; Emp_ssn, Pno → Esal, Ephone, Dno, Pname, Plocation}.
By
applying the minimal cover Algorithm 16.2, in step 3 we see that Pno is a redun-dant attribute in Emp_ssn, Pno → Esal, Ephone, Dno. Moreover, Emp_ssn is redun-dant in Emp_ssn, Pno → Pname, Plocation. Hence
the minimal cover consists of FD1 and FD2 only (FD3 being completely redundant)
as follows (if we group attributes with the same left-hand side into one FD):
Minimal
cover G: {Emp_ssn → Esal, Ephone, Dno; Pno → Pname, Plocation}
By
applying Algorithm 16.4 to the above Minimal cover G, we get a 3NF design con-sisting of two relations with keys Emp_ssn and Pno as follows:
R1 (Emp_ssn, Esal, Ephone, Dno)
R2 (Pno, Pname, Plocation)
An
observant reader would notice easily that these two relations have lost the
original information contained in the key of the universal relation U (namely, that there are certain
employees working on certain projects in a many-to-many relationship). Thus,
while the algorithm does preserve the original dependencies, it makes no
guarantee of preserving all of the information. Hence, the resulting design is
a lossy design.
Claim 3. Every relation schema created by
Algorithm 16.4 is in 3NF. (We will not
provide a formal proof here;6 the proof depends on G being a minimal set of dependencies.)
It is
obvious that all the dependencies in G
are preserved by the algorithm because each dependency appears in one of the
relations Ri in the
decomposition D. Since G is equivalent to F, all the dependencies in F
are either preserved directly in the decomposition or are derivable using the
inference rules from Section 16.1.1 from those in the resulting relations, thus
ensuring the dependency preservation prop-erty. Algorithm 16.4 is called a relational synthesis algorithm, because
each relation schema Ri
in the decomposition is synthesized (constructed) from the set of functional
dependencies in G with the same
left-hand-side X.
2. Nonadditive Join Decomposition into BCNF Schemas
The next
algorithm decomposes a universal relation schema R = {A1, A2, ..., An} into a decomposition D = {R1, R2, ..., Rm} such that each Ri
is in BCNF and the decomposition D has the lossless join property with
respect to F. Algorithm 16.5 utilizes
Property NJB and Claim 2 (preservation of nonadditivity in successive
decompositions) to create a nonadditive join decomposition D = {R1,
R2, ..., Rm} of a universal relation R based on a set of functional
dependencies F, such that each Ri in D is in BCNF.
Algorithm 16.5. Relational Decomposition into
BCNF with Nonadditive Join
Property
Input: A universal relation R and a set of functional dependencies F on the attributes of R.
Set D :=
{R} ;
While there is a relation schema Q in D
that is not in BCNF do
{
choose a
relation schema Q in D that is not in BCNF;
find a
functional dependency X → Y in Q that violates BCNF; replace Q in D
by two relation schemas (Q – Y) and (X ∪ Y);
} ;
Each time
through the loop in Algorithm 16.5, we decompose one relation schema Q that is not in BCNF into two relation
schemas. According to Property NJB for binary
decompositions and Claim 2, the decomposition D has the nonadditive join property. At the end of the algorithm,
all relation schemas in D will be in
BCNF. The reader can check that the normalization example in Figures 15.12 and
15.13 basi-cally follows this algorithm. The functional dependencies FD3, FD4,
and later FD5 violate BCNF, so the LOTS relation
is decomposed appropriately into BCNF rela-tions, and the decomposition then satisfies
the nonadditive join property. Similarly, if we apply the algorithm to the TEACH relation schema from Figure
15.14, it is decomposed into TEACH1(Instructor, Student) and TEACH2(Instructor, Course) because the dependency FD2 Instructor → Course violates BCNF.
In step 2
of Algorithm 16.5, it is necessary to determine whether a relation schema Q is in BCNF or not. One method for
doing this is to test, for each functional
dependency X → Y in Q, whether X+ fails to include all the
attributes in Q, thereby determining
whether or not X is a (super)key in Q. Another technique is based on an
observation that whenever a relation schema Q
has a BCNF violation, there exists a pair of attributes A and B in Q such that {Q – {A, B} } → A; by computing the closure {Q – {A,
B} }+ for each pair of
attributes {A, B} of Q, and checking
whether the closure includes A (or B), we can determine whether Q is in BCNF.
3. Dependency-Preserving and Nonadditive (Lossless) Join Decomposition
into 3NF Schemas
So far,
in Algorithm 16.4 we showed how to achieve a 3NF design with the potential for
loss of information and in Algorithm 16.5 we showed how to achieve BCNF design
with the potential loss of certain functional dependencies. By now we know that
it is not possible to have all three of
the following: (1) guaranteed nonlossy design, (2) guaranteed dependency
preservation, and (3) all relations in BCNF.
As we have said before, the first condition is a must and cannot be
compromised. The second condition is desirable, but not a must, and may have to
be relaxed if we insist on achieving BCNF. Now we give an alternative algorithm
where we achieve conditions 1 and 2 and only guarantee 3NF. A simple
modification to Algorithm 16.4, shown as Algorithm 16.6, yields a decomposition
D of R that does the following:
Preserves dependencies
Has the nonadditive join property
Is such that each resulting relation schema in the
decomposition is in 3NF
Because
the Algorithm 16.6 achieves both the desirable properties, rather than only
functional dependency preservation as guaranteed by Algorithm 16.4, it is
preferred over Algorithm 16.4.
Algorithm 16.6. Relational Synthesis into 3NF
with Dependency Preservation and
Nonadditive Join Property
Input: A universal relation R and a set of functional dependencies F on the attributes of R.
Find a minimal cover G for F (use Algorithm
16.2).
For each left-hand-side X of a functional dependency that appears in G, cre-ate a relation schema in D
with attributes {X ∪ {A1}
∪ {A2}
... ∪ {Ak}
}, where X → A1, X → A2, ..., X → Ak are the only dependencies in G with X as
left-hand-side (X is the key of this
relation).
If none of the relation schemas in D contains a key of R, then create one more relation schema in D that contains attributes that form a key of R.7 (Algorithm 16.2(a) may be used to find a key.)
Eliminate redundant relations from the resulting set of relations in the relational database schema. A relation R is considered redundant if R is a projection of another relation S in the schema; alternately, R is subsumed by S.
Step 3 of
Algorithm 16.6 involves identifying a key K
of R. Algorithm 16.2(a) can be used
to identify a key K of R based on the set of given functional
dependencies F. Notice that the set
of functional dependencies used to determine a key in Algorithm 16.2(a) could
be either F or G, since they are equivalent.
Example 1 of Algorithm 16.6. Let us
revisit the example given earlier at the end
of
Algorithm 16.4. The minimal cover G
holds as before. The second step produces relations R1 and R2
as before. However, now in step 3, we will generate a relation corresponding to
the key {Emp_ssn, Pno}. Hence, the resulting design
contains:
R1 (Emp_ssn , Esal, Ephone, Dno)
R2 (Pno, Pname, Plocation)
R3 (Emp_ssn, Pno)
This
design achieves both the desirable properties of dependency preservation and
nonadditive join.
Example 2 of Algorithm 16.6 (Case X ). Consider the relation schema LOTS1A shown in Figure 15.13(a). Assume
that this relation is given as a universal relation with the following functional
dependencies:
FD1: Property_id → Lot#, County, Area
FD2: Lot#, County → Area, Property_id
FD3: Area → County
These
were called FD1, FD2, and FD5 in Figure 15.13(a). The meanings of the above
attributes and the implication of the above functional dependencies were
explained in Section 15.4. For ease of reference, let us abbreviate the above
attributes with the first letter for each and represent the functional
dependencies as the set
F : { P → LCA, LC → AP, A → C }.
If we
apply the minimal cover Algorithm 16.2 to F,
(in step 2) we first represent the set F
as
F : {P → L, P → C, P → A, LC → A, LC → P, A → C}.
In the
set F, P → A can be
inferred from P → LC and LC → A; hence P → A by tran-sitivity and is
therefore redundant. Thus, one possible minimal cover is
Minimal
cover GX: {P → LC, LC → AP, A → C }.
In step 2
of Algorithm 16.6 we produce design X
(before removing redundant rela-tions) using the above minimal cover as
Design X: R1
(P, L, C), R2 (L, C, A, P), and R3 (A, C).
In step 4
of the algorithm, we find that R3
is subsumed by R2 (that
is, R3 is always a
projection of R2 and R1 is a projection of R2 as well. Hence both of
those relations are redundant. Thus the 3NF schema that achieves both of the
desirable properties is (after removing redundant relations)
Design X: R2
(L, C, A, P).
or, in
other words it is identical to the relation LOTS1A (Lot#, County, Area, Property_id) that we had determined to be in
3NF in Section 15.4.2.
Example 2 of Algorithm 16.6 (Case Y ). Starting with LOTS1A as the universal relation and with the same given
set of functional dependencies, the second step of the minimal cover Algorithm
16.2 produces, as before
F: {P → C, P → A, P → L, LC → A, LC → P, A → C}.
The FD LC → A may be considered redundant
because LC → P and P → A implies LC → A by transitivity. Also, P → C may be considered to be redundant
because P → A and A → C implies P → C by
transitivity. This gives a different minimal cover as
Minimal
cover GY: { P → LA, LC → P, A → C }.
The
alternative design Y produced by the
algorithm now is
Design Y: S1
(P, A, L), S2 (L, C, P), and S3 (A, C).
Note that
this design has three 3NF relations, none of which can be considered as
redundant by the condition in step 4. All FDs in the original set F are preserved. The reader will notice
that out of the above three relations, relations S1 and S3 were pro-duced as the BCNF design
by the procedure given in Section 15.5 (implying that S2 is redundant in the presence of S1 and S3).
However, we cannot eliminate relation S2
from the set of three 3NF relations above since it is not a projection of
either S1 or S3. Design Y therefore remains as one possible
final result of applying Algorithm 16.6
to the given universal relation that provides relations in 3NF.
It is
important to note that the theory of nonadditive join decompositions is based
on the assumption that no NULL values are
allowed for the join attributes. The next section discusses some of the
problems that NULLs may
cause in relational decom-positions and provides a general discussion of the
algorithms for relational design by synthesis presented in this section.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2023 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.