PART 6
Database
Design Theory and
Normalization
Chapter 15
Basics of
Functional Dependencies
and Normalization for
Relational Databases
In Chapters 3 through 6, we presented various aspects of the relational
model and the languages associated with it. Each relation schema consists of a number of attributes, and the relational database schema consists of a
number of relation schemas. So far, we have assumed that attributes are grouped to form a relation schema by using the
common sense of the database designer or by mapping a database schema design
from a conceptual data model such as the ER or Enhanced-ER (EER) data model.
These models make the designer identify entity types and relationship types and
their respective attributes, which leads to a natural and logical grouping of
the attributes into relations when the mapping procedures discussed in Chapter
9 are followed. However, we still need some formal way of analyzing why one
grouping of attributes into a rela-tion schema may be better than another.
While discussing database design in Chapters 7 through 10, we did not develop
any measure of appropriateness or goodness
to measure the quality of the design, other than the intuition of the designer. In this chapter we discuss
some of the theory that has been developed with the goal of evaluating
relational schemas for design quality—that is, to measure for-mally why one set
of groupings of attributes into relation schemas is better than another.
There are two levels at which we can discuss the goodness of relation schemas. The first is the logical (or conceptual) level—how users interpret the relation
schemas and the meaning of their attributes. Having good relation schemas at
this level enables users to understand clearly the meaning of the data in the
relations, and hence to formulate their queries correctly. The second is the implementation (or physical storage) level—how
the tuples in a base relation are stored and updated. This level applies only to schemas of base relations—which will be
physically stored as files—whereas at the logical level we are interested in
schemas of both base rela-tions and views (virtual relations). The relational
database design theory developed in this chapter applies mainly to base relations, although some criteria
of appropri-ateness also apply to views, as shown in Section 15.1.
As with many design problems, database design may be performed using two
approaches: bottom-up or top-down. A bottom-up
design methodology (also called design
by synthesis) considers the basic relationships among individual attributes as the starting point and uses those
to construct relation schemas. This approach
is not very popular in practice because it suffers from the problem of having to collect a large number
of binary relationships among attributes as the starting point. For practical
situations, it is next to impossible to capture binary relationships among all
such pairs of attributes. In contrast, a top-down
design methodology (also called design
by analysis) starts with a number of groupings of attributes into relations that exist together naturally, for
example, on an invoice, a form, or a report. The relations are then analyzed
individually and collectively, leading to further decomposition until all
desirable properties are met. The theory described in this chapter is
applicable to both the top-down and bottom-up design approaches, but is more
appropriate when used with the top-down approach.
Relational database design ultimately produces a set of relations. The
implicit goals of the design activity are information
preservation and minimum redundancy.
Information is very hard to quantify—hence we consider information preservation
in terms of maintaining all concepts, including attribute types, entity types,
and relationship types as well as generalization/specialization relationships,
which are described using a model such as the EER model. Thus, the relational
design must preserve all of these concepts, which are originally captured in
the conceptual design after the conceptual to logical design mapping.
Minimizing redundancy implies minimizing redundant storage of the same
information and reducing the need for multiple updates to maintain consistency
across multiple copies of the same information in response to real-world events
that require making an update.
We start this chapter by informally discussing some criteria for good
and bad rela-tion schemas in Section 15.1. In Section 15.2, we define the
concept of functional dependency, a formal constraint among
attributes that is the main tool for formally measuring the appropriateness of attribute groupings into relation
schemas. In Section 15.3, we discuss normal forms and the process of
normalization using func-tional dependencies. Successive normal forms are
defined to meet a set of desirable constraints expressed using functional
dependencies. The normalization procedure consists of applying a series of
tests to relations to meet these increasingly stringent requirements and
decompose the relations when necessary. In Section 15.4, we dis cuss more
general definitions of normal forms that can be directly applied to any given
design and do not require step-by-step analysis and normalization. Sections
to 15.7 discuss further normal
forms up to the fifth normal form. In Section
we introduce the multivalued
dependency (MVD), followed by the join dependency (JD) in Section 15.7. Section
15.8 summarizes the chapter.
Chapter 16 continues the development of the theory related to the design
of good relational schemas. We discuss desirable properties of relational
decomposition— nonadditive join property and functional dependency preservation
property. A general algorithm that tests whether or not a decomposition has the
nonadditive (or lossless) join
property (Algorithm 16.3 is also presented). We then discuss properties of functional dependencies and the
concept of a minimal cover of dependencies. We consider the bottom-up approach
to database design consisting of a set of algo-rithms to design relations in a
desired normal form. These algorithms assume as input a given set of functional
dependencies and achieve a relational design in a tar-get normal form while
adhering to the above desirable properties. In Chapter 16 we also define
additional types of dependencies that further enhance the evaluation of the goodness of relation schemas.
If Chapter 16 is not covered in a course, we recommend a quick
introduction to the desirable properties of decomposition and the discussion of
Property NJB in Section 16.2.
Related Topics
Privacy Policy, Terms and Conditions, DMCA Policy and Compliant
Copyright © 2018-2023 BrainKart.com; All Rights Reserved. Developed by Therithal info, Chennai.