Further Topics in Functional Dependencies: Inference Rules, Equivalence,
and Minimal Cover
We introduced the concept of functional dependencies (FDs) in Section
15.2, illustrated it with some examples, and developed a notation to denote
multiple FDs over a single relation. We identified and discussed problematic
functional dependencies in Sections 15.3 and 15.4 and showed how they can be
eliminated by a proper decomposition of a relation. This process was described
as normalization and we showed how to
achieve the first through third normal forms (1NF through 3NF) given primary
keys in Section 15.3. In Sections 15.4 and 15.5 we provided generalized tests
for 2NF, 3NF, and BCNF given any number of candidate keys in a relation and
showed how to achieve them. Now we return to the study of functional dependencies
and show how new dependencies can be inferred from a given set and dis-cuss the
concepts of closure, equivalence, and minimal cover that we will need when we
later consider a synthesis approach to design of relations given a set of FDs.
1. Inference Rules for
Functional Dependencies
We denote
by F the set of functional
dependencies that are specified on relation schema R. Typically, the schema designer specifies the functional
dependencies that are semantically
obvious; usually, however, numerous other functional dependencies hold in all legal relation instances among sets
of attributes that can be derived from and satisfy the dependencies in F. Those other dependencies can be inferred or deduced from the FDs in F.
In real
life, it is impossible to specify all possible functional dependencies for a
given
situation.
For example, if each department has one manager, so that
uniquely
determines Mgr_ssn (Dept_no → Mgr_ssn), and a manager has a unique
phone number called Mgr_phone (Mgr_ssn → Mgr_phone), then these two dependencies
together imply that Dept_no → Mgr_phone. This is an inferred FD and need not be explicitly stated in addition to
the two given FDs. Therefore, it is useful to define a concept called closure formally that includes all
possible dependencies that can be inferred from the given set F.
Definition. Formally, the set of all
dependencies that include F
as well as all dependencies that can be inferred from F is called the closure
of F; it is denoted by F+.
For
example, suppose that we specify the following set F of obvious functional dependencies on the relation schema in
Figure 15.3(a):
F = {Ssn → {Ename, Bdate, Address, Dnumber}, Dnumber → {Dname, Dmgr_ssn} }
Some of
the additional functional dependencies that we can infer from F are the
fol-lowing:
Ssn → {Dname, Dmgr_ssn}
Ssn → Ssn
Dnumber → Dname
An FD X → Y is inferred from a set of dependencies F specified on R if X → Y holds in every legal relation state r
of R; that is, whenever r satisfies all the dependencies in F, X
→ Y also holds in r. The
closure F+ of F is the set of all functional
dependencies that can be inferred from F.
To determine a systematic way to infer dependencies, we must discover a set of inference rules that can be used to
infer new dependencies from a given set of dependencies. We consider some of
these inference rules next. We use the notation F |=X → Y to denote that the functional dependency X → Y is inferred from the set of functional
dependencies F.
In the
following discussion, we use an abbreviated notation when discussing
functional dependencies. We concatenate attribute variables and drop the
commas for convenience. Hence, the FD {X,Y} → Z is abbreviated to XY → Z, and the FD {X, Y, Z}
→ {U,
V} is abbreviated to XYZ → UV. The
following six rules IR1 through IR6 are well-known inference rules
for functional dependencies:
IR1 (reflexive rule)1: If X ⊇ Y, then X →Y.
IR2 (augmentation rule)2:
{X
→ Y} |=XZ
→ YZ.
IR3 (transitive rule): {X
→ Y, Y → Z} |=X
→ Z.
IR4 (decomposition, or projective,
rule): {X → YZ} |=X
→ Y.
IR5 (union, or additive, rule): {X
→ Y, X → Z} |=X
→ YZ.
IR6 (pseudotransitive rule): {X
→ Y, WY
→ Z} |=WX
→ Z.
The
reflexive rule (IR1) states
that a set of attributes always determines itself or any of its subsets, which
is obvious. Because IR1
generates dependencies that are always true, such dependencies are called trivial. Formally, a functional
dependency X → Y is trivial if X ⊇ Y; otherwise, it is nontrivial. The augmentation rule (IR2) says that adding the same set of attributes to both the left- and
right-hand sides of a dependency results in another valid dependency.
According to IR3, functional
dependencies are transitive. The decomposition rule (IR4) says that we can remove
attributes from the right-hand side of a dependency; applying this rule
repeatedly can decompose the FD X → {A1, A2,
..., An} into the set of
dependencies {X → A1, X → A2, ..., X → An}. The
union rule (IR5) allows
us to do the opposite; we can combine a set of dependencies {X → A1, X → A2, ..., X → An} into the single FD X → {A1, A2, ..., An}.
The pseudotransitive rule (IR6) allows us to replace a set of attributes Y on the left hand side of a dependency
with another set X that functionally
determines Y, and can be derived from
IR2 and IR3 if we augment the first functional dependency X → Y with W (the augmentation rule) and then apply the transitive rule.
One cautionary note regarding the use of
these rules. Although X → A and X → B implies X → AB by the union rule stated above, X → A and Y → B does imply that XY → AB. Also, XY → A does not necessarily imply either
X → A or Y → A.
Each of
the preceding inference rules can be proved from the definition of functional
dependency, either by direct proof or by
contradiction. A proof by contradiction assumes that the rule does not
hold and shows that this is not possible. We now prove that the first three
rules IR1 through IR3 are valid. The second proof is
by contradiction.
Proof of IR1. Suppose that X ⊇ Y and that two tuples t1 and t2 exist in some relation instance r of R
such that t1 [X] = t2
[X]. Then t1[Y] = t2[Y] because X ⊇ Y;
hence, X → Y must hold in r.
Proof of IR2 (by contradiction). Assume
that X → Y holds in a relation instance r of R but that XZ → YZ does not hold. Then there must
exist two tuples t1 and
t2 in r such that (1) t1 [X] =
t2 [X], (2) t1 [Y] =
t2 [Y], (3) t1 [XZ] =
t2 [XZ], and
t1 [YZ] ≠ t2 [YZ]. This is not possible because from (1) and (3) we deduce
t1 [Z] = t2 [Z],
and from (2) and (5) we deduce (6) t1 [YZ]
= t2 [YZ], contra-dicting
(4).
Proof of IR3. Assume that (1) X → Y and (2) Y → Z both hold in a relation r. Then for any two tuples t1 and t2 in r such
that t1 [X] = t2
[X], we must have (3) t1 [Y] = t2 [Y], from assumption
(1); hence we must also have (4) t1 [Z]
= t2 [Z] from (3) and assumption (2); thus X → Z must hold in r.
Using
similar proof arguments, we can prove the inference rules IR4 to IR6 and any additional valid
inference rules. However, a simpler way to prove that an inference rule for
functional dependencies is valid is to prove it by using inference rules that
have already been shown to be valid. For example, we can prove IR4 through IR6 by using IR1 through IR3 as follows.
Proof of IR4 (Using IR1 through IR3).
X → YZ (given).
YZ → Y (using IR1 and knowing that YZ ⊇ Y).
X → Y (using IR3 on 1 and 2).
Proof of IR5 (using IR1 through IR3).
X →Y (given).
X → Z (given).
X → XY (using IR2 on 1 by augmenting with X; notice that XX = X).
XY → YZ (using IR2 on 2 by augmenting with Y).
X → YZ (using IR3 on 3 and 4).
Proof of IR6 (using IR1 through IR3).
X → Y (given).
WY → Z (given).
WX → WY (using IR2 on 1 by augmenting with W).
WX → Z (using IR3 on 3 and 2).
It has
been shown by Armstrong (1974) that inference rules IR1 through IR3 are sound and complete. By sound, we mean that given a set of
functional dependencies F specified
on a relation schema R, any
dependency that we can infer from F by using IR1 through IR3 holds in
every relation state r of R that satisfies the dependencies in
F. By complete, we mean that using IR1 through IR3 repeatedly to infer dependencies until no more dependencies
can be inferred results in the complete set of all possible dependencies that can be inferred from F. In other words, the set of
dependencies F+, which we
called the closure of F, can be determined from F by using only inference rules IR1 through IR3. Inference rules IR1 through IR3 are known as Armstrong’s inference rules.3
Typically,
database designers first specify the set of functional dependencies F that can easily be determined from the
semantics of the attributes of R;
then IR1, IR2, and IR3 are used
to infer additional functional dependencies that will also hold on R. A systematic way to determine these
additional functional dependencies is first to determine each set of attributes X that appears as a left-hand side of some func-tional dependency
in F and then to determine the set of
all attributes that are dependent on X.
Definition. For each such set of attributes X,
we determine the set X+ of attrib-utes that are functionally determined by X based on F; X+ is
called the closure of X under F. Algorithm 16.1 can be
used to calculate X+.
Algorithm 16.1. Determining X+, the
Closure of X
under F
Input: A set F of FDs on a relation schema R, and a set of attributes X,
which is a subset of R.
X+ := X;
repeat
oldX+ := X+;
for each
functional dependency Y → Z in F do if X+ ⊇ Y then X+ := X+
∪ Z;
until (X+ = oldX+);
Algorithm
16.1 starts by setting X+
to all the attributes in X. By IR1, we know that all these
attributes are functionally dependent on X.
Using inference rules IR3 and IR4, we add attributes to X+, using each functional
dependency in F. We keep going
through all the dependencies in F
(the repeat loop) until no more
attributes are added to X+
during a complete cycle (of the for loop) through the dependencies in F. For example, consider the relation
schema EMP_PROJ in Figure 15.3(b); from the
semantics of the attributes, we specify the following set F of functional dependen-cies that should hold on EMP_PROJ:
F = {Ssn → Ename,
Pnumber → {Pname, Plocation},
{Ssn, Pnumber} → Hours}
Using
Algorithm 16.1, we calculate the following closure sets with respect to F:
{Ssn} + = {Ssn, Ename}
{Pnumber} + = {Pnumber, Pname, Plocation}
{Ssn, Pnumber} + = {Ssn, Pnumber, Ename, Pname, Plocation, Hours}
Intuitively,
the set of attributes in the right-hand side in each line above represents all
those attributes that are functionally dependent on the set of attributes in
the left-hand side based on the given set F.
2. Equivalence of Sets
of Functional Dependencies
In this
section we discuss the equivalence of two sets of functional dependencies.
First, we give some preliminary definitions.
Definition. A set of functional dependencies F is said to cover another set of functional
dependencies E if every FD in E is also in F+; that is, if every dependency in E can be inferred from F; alternatively, we can say that E is covered by F.
Definition. Two sets of functional dependencies E and F are equivalent if E+ =
F+. Therefore, equivalence means that every FD in E can be inferred from F, and every FD in F can be inferred from E;
that is, E is equivalent to F if both the conditions—E covers F and F covers E—hold.
We can
determine whether F covers E by calculating X+ with respect to
F for each FD X → Y in E, and
then checking whether this X+ includes the attributes in Y. If this is the case for every FD in
E, then F covers E. We determine
whether E and F are equivalent by checking that E covers F and F covers E. It is left to the reader as an exercise to show that the
following two sets of FDs are equivalent:
F = {A → C, AC → D, E → AD, E → H} and G = {A → CD, E
→ AH}.
3. Minimal Sets of
Functional Dependencies
Informally,
a minimal cover of a set of
functional dependencies E is a set of
functional dependencies F that
satisfies the property that every dependency in E is in the closure F+
of F. In addition, this property is
lost if any dependency from the set F
is removed; F must have no
redundancies in it, and the dependencies in F
are in a standard form. To satisfy these properties, we can formally define a
set of functional dependencies F to
be minimal if it satisfies the
following conditions:
Every dependency in F has a single attribute for its right-hand side.
We cannot replace any dependency X → A in F
with a dependency Y → A, where Y is a proper
subset of X, and still have a set of
dependencies that is equivalent to F.
We cannot remove any dependency from F and still have a set of dependencies
that is equivalent to F.
We can
think of a minimal set of dependencies as being a set of dependencies in a standard or canonical form and with no
redundancies. Condition 1 just represents every dependency in a canonical
form with a single attribute on the right-hand side. Conditions 2 and 3 ensure
that there are no redundancies in the dependencies either by having redundant
attributes on the left-hand side of a dependency (Condition 2) or by having a
dependency that can be inferred from the remaining FDs in F (Condition 3).
Definition. A minimal cover of a set of
functional dependencies E
is a minimal set of dependencies
(in the standard canonical form and without redundancy) that is equivalent to E. We can always find at least one minimal cover F for any set of dependencies E using Algorithm 16.2.
If
several sets of FDs qualify as minimal covers of E by the definition above, it is cus-tomary to use additional
criteria for minimality. For example,
we can choose the minimal set with the smallest
number of dependencies or with the smallest total length (the total
length of a set of dependencies is calculated by concatenating the dependencies and treating them as one
long character string).
Algorithm 16.2. Finding a Minimal Cover F for a Set of Functional Dependencies E
Input: A set of functional dependencies
E.
Set F := E.
Replace each functional dependency X → {A1, A2, ..., An}
in F by the n func-tional dependencies X
→A1, X →A2, ..., X → An.
For each functional dependency X → A in F
for each
attribute B that is an element of X
if { {F – {X
→ A} } ∪ { (X – {B}
) → A} } is equivalent to F then replace X → A with (X – {B} ) → A in F.
4.
For each remaining functional dependency
X
→ A in
F
if {F – {X → A} } is
equivalent to F,
then
remove X → A from F.
We
illustrate the above algorithm with the following:
Let the
given set of FDs be E : {B → A, D
→ A, AB → D}. We have to find the mini-mal cover of E.
All above dependencies are in canonical form (that
is, they have only one
attribute
on the right-hand side), so we have completed step 1 of Algorithm 16.2 and can
proceed to step 2. In step 2 we need to determine if AB → D has any redundant attribute on the
left-hand side; that is, can it be replaced by B → D or A → D?
Since B → A, by
augmenting with B on both sides (IR2), we have BB → AB, or B → AB (i). However, AB → D as given (ii).
Hence by the transitive rule (IR3), we get from (i) and (ii), B → D. Thus AB → D may be replaced by B → D.
We now have a set equivalent to original E, say E : {B → A, D → A, B → D}. No further reduction is possible in step 2 since all FDs have a
single attribute on the left-hand side.
In step 3 we look for a redundant FD in E . By using the transitive rule on B → D and D → A, we derive B → A. Hence B → A is redundant in E and can be eliminated.
Therefore, the minimal cover of E is {B → D, D
→ A}.
In
Section 16.3 we will see how relations can be synthesized from a given set of
dependencies E by first finding the
minimal cover F for E.
Next, we
provide a simple algorithm to determine the key of a relation:
Algorithm 16.2(a). Finding a
Key K for R Given a set F of Functional Dependencies
Input: A relation R and a set of functional dependencies F on the attributes of R.
Set K := R.
For each attribute A in K
{compute
(K – A)+ with respect to F;
if (K –
A)+ contains all the attributes in R, then set K := K – {A} };
In
Algoritm 16.2(a), we start by setting K
to all the attributes of R; we then
remove one attribute at a time and check whether the remaining attributes still
form a superkey. Notice, too, that Algorithm 16.2(a) determines only one key out of the possible candidate
keys for R; the key returned depends
on the order in which attributes are removed from R in step 2.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2023 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.