Entity Types, Entity Sets, Attributes, and Keys
The ER model describes data as entities,
relationships, and attributes. In Section 7.3.1 we
introduce the concepts of entities and their attributes. We discuss entity
types and key attributes in Section 7.3.2. Then, in Section 7.3.3, we specify
the initial conceptual design of the entity types for the COMPANY database. Relationships are described in Section 7.4.
1. Entities and Attributes
Entities and Their Attributes. The basic object that the ER model represents is an entity, which is a thing
in the real world with an independent existence. An entity may be an object
with a physical existence (for example, a particular person, car, house, or
employee) or it may be an object with a conceptual existence (for instance, a
company, a job, or a university course). Each entity has attributes—the particular properties that describe it. For example,
an EMPLOYEE entity may be described by the employee’s name, age, address, salary,
and job. A particular entity will have a
value for each of its attributes. The attribute values that describe
each entity become a major part of the data stored in the database.
Figure 7.3 shows two entities and the values of their attributes. The EMPLOYEE entity e1 has four attributes: Name, Address, Age, and Home_phone; their values are ‘John Smith’, ‘2311 Kirby, Houston, Texas 77001’, ‘55’, and ‘713-749-2630’, respectively. The COMPANY entity c1 has three attributes: Name, Headquarters, and President; their values are ‘Sunco Oil’, ‘Houston’, and ‘John Smith’, respectively.
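As a minimal sketch, the two entities of Figure 7.3 can be written as mappings from attribute names to values. The names e1 and c1 follow the figure; the choice of Python dictionaries, and of an integer for Age, is an assumption made only for illustration.

```python
# Each entity is modeled as a mapping from attribute name to attribute value.
e1 = {
    "Name": "John Smith",
    "Address": "2311 Kirby, Houston, Texas 77001",
    "Age": 55,
    "Home_phone": "713-749-2630",
}
c1 = {
    "Name": "Sunco Oil",
    "Headquarters": "Houston",
    "President": "John Smith",
}

# List every attribute of the EMPLOYEE entity together with its value.
for attribute, value in e1.items():
    print(f"{attribute}: {value}")
```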
Several types of attributes occur in the ER model: simple versus composite, single-valued versus multivalued, and stored versus derived. First we define these attribute types and illustrate their use via examples. Then we discuss the concept of a NULL value for an attribute.
Composite versus Simple (Atomic) Attributes. Composite attributes can be divided into smaller subparts, which represent more basic attributes with independent meanings. For example, the Address attribute of the EMPLOYEE entity shown in Figure 7.3 can be subdivided into Street_address, City, State, and Zip, with the values ‘2311 Kirby’, ‘Houston’, ‘Texas’, and ‘77001’.
Attributes that are not divisible are called simple or atomic attributes.
Composite attributes can form a hierarchy; for example, Street_address can be further subdivided into three simple component attributes: Number, Street, and Apartment_number, as shown in Figure 7.4. The value of a composite attribute is the
concatenation of the values of its component simple attributes.
Composite attributes are useful to model situations in which a user
sometimes refers to the composite attribute as a unit but at other times refers
specifically to its components. If the composite attribute is referenced only
as a whole, there is no
need to subdivide it into component attributes. For example, if there is
no need to refer to the individual components of an address (Zip Code, street,
and so on), then the whole address can be designated as a simple attribute.
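The hierarchy described above can be sketched with nested record types. The class names, the field types, and the as_text formatting are assumptions for illustration; only the attribute names and the example values come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StreetAddress:
    number: str
    street: str
    apartment_number: Optional[str] = None  # None when there is no apartment

    def as_text(self) -> str:
        parts = [self.number, self.street]
        if self.apartment_number is not None:
            parts.append(f"Apt {self.apartment_number}")
        return " ".join(parts)


@dataclass(frozen=True)
class Address:
    street_address: StreetAddress  # itself a composite attribute
    city: str
    state: str
    zip_code: str

    def as_text(self) -> str:
        # The composite value is the concatenation of its component values.
        return f"{self.street_address.as_text()}, {self.city}, {self.state} {self.zip_code}"


address = Address(StreetAddress("2311", "Kirby"), "Houston", "Texas", "77001")
print(address.as_text())  # the address referenced as a unit
print(address.city)       # a single component referenced on its own
```

If the address were only ever used as a whole, a single string field would be enough and the two classes could be dropped.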
Single-Valued versus Multivalued Attributes. Most attributes have a single value for a particular entity; such attributes are called single-valued. For example, Age is a single-valued attribute of a person. In some cases an attribute can have a set of values for the same entity: for instance, a Colors attribute for a car, or a College_degrees attribute
for a person. Cars with one color have a single value, whereas two-tone cars have two color values. Similarly, one person may
not have a college degree, another person may have one, and a third person may
have two or more degrees; therefore, different people can have different numbers of values for the College_degrees attribute. Such attributes are called
multivalued. A multivalued attribute
may have lower and upper bounds to constrain the number of values allowed for each individual entity. For example,
the Colors attribute of a car may be restricted to have between one and three
values, if we assume that a car can have three colors at most.
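A small sketch of the lower and upper bounds on a multivalued attribute follows. The helper name check_cardinality and the color values are hypothetical; the bounds of one and three are the ones assumed in the text for Colors.

```python
def check_cardinality(values, lower, upper):
    """Return the values as a frozenset if their count lies in [lower, upper]."""
    distinct = frozenset(values)
    if not lower <= len(distinct) <= upper:
        raise ValueError(
            f"expected between {lower} and {upper} values, got {len(distinct)}"
        )
    return distinct


# A two-tone car has two values for Colors, which is within the bounds 1..3.
two_tone = check_cardinality({"black", "silver"}, lower=1, upper=3)
print(sorted(two_tone))

# Four colors would violate the upper bound.
try:
    check_cardinality({"black", "silver", "red", "blue"}, lower=1, upper=3)
except ValueError as error:
    print("rejected:", error)
```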
Stored versus Derived Attributes. In some cases, two (or more)
attribute values are related—for example, the Age and Birth_date attributes of a person. For a particular person entity, the value of Age can be determined from the current (today’s) date and the value of that
person’s Birth_date. The Age attribute is hence called a derived
attribute and is said to be derivable
from the Birth_date attribute, which is called a stored
attribute. Some attribute values can be derived from related entities; for example, an attribute Number_of_employees of a DEPARTMENT entity can be derived by counting the number of employees related to
(working for) that department.
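Both kinds of derivation can be sketched in a few lines. The function names, the sample dates, and the works_for mapping are hypothetical; the point is only that Age and Number_of_employees are computed on demand rather than stored.

```python
from datetime import date


def derive_age(birth_date: date, today: date) -> int:
    """Derive Age from the stored Birth_date and the current date."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def number_of_employees(works_for: dict, department: str) -> int:
    """Derive Number_of_employees by counting related employee entities."""
    return sum(1 for d in works_for.values() if d == department)


# Hypothetical data: which department each employee works for.
works_for = {"e1": "Research", "e2": "Research", "e3": "Administration"}

print(derive_age(date(1965, 1, 9), date(2020, 1, 8)))
print(number_of_employees(works_for, "Research"))
```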
NULL Values. In some cases, a particular entity may not have an applicable value for an attribute. For example, the Apartment_number
attribute of an address applies only to addresses that are in apartment
buildings and not to other types of residences, such as single-family homes.
Similarly, a College_degrees attribute applies only to people with college degrees. For such
situations, a special value called NULL is
created. An address of a single-family home would have NULL for its Apartment_number
attribute, and a person with no college degree
would have
NULL for College_degrees. NULL can also be used if we do not know the value of an attribute for a particular entity, for example, if we do not know the home phone number of
‘John Smith’ in Figure 7.3. The meaning of the former type of NULL is not applicable, whereas the meaning of the latter is unknown. The unknown category of NULL can be further classified into two cases. The first case arises when it
is known
that the attribute value exists but is missing—for instance, if the Height attribute of a person is listed as NULL. The
second case arises when it is not known
whether the attribute value exists—for example, if the Home_phone attribute of a person is NULL.
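The three meanings of NULL distinguished above can be made explicit in code. This is only a sketch: a database typically stores a single NULL marker, and the enumeration Null and the helper describe are hypothetical names introduced here.

```python
from enum import Enum


class Null(Enum):
    NOT_APPLICABLE = "attribute does not apply to this entity"
    MISSING = "value exists but is not recorded"
    NOT_KNOWN = "not known whether a value exists"


def describe(attribute, value):
    """Render an attribute value, spelling out the meaning of a NULL."""
    if isinstance(value, Null):
        return f"{attribute}: NULL ({value.value})"
    return f"{attribute}: {value}"


person = {
    "Name": "John Smith",
    "Apartment_number": Null.NOT_APPLICABLE,  # lives in a single-family home
    "Height": Null.MISSING,                   # every person has a height
    "Home_phone": Null.NOT_KNOWN,             # may or may not have a home phone
}

for attribute, value in person.items():
    print(describe(attribute, value))
```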
Complex Attributes. Notice that, in general, composite and multivalued attributes can be
nested arbitrarily. We can represent arbitrary nesting by grouping components
of a composite attribute between parentheses () and separating the components
with commas, and by displaying multivalued attributes between braces { }. Such
attributes are called complex attributes.
For example, if a person can have more than one residence and each residence
can have a single address and multiple phones, an attribute Address_phone for a person can be specified as shown in Figure 7.5. Both Phone and Address are themselves composite attributes.
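The parenthesis and brace notation can be generated mechanically from a nested description of the attribute. In the sketch below, the render function and the dictionary layout are assumptions, and the components of Phone (Area_code and Phone_number) are assumed for illustration because the text does not list them; the components of Address follow the earlier discussion of composite attributes.

```python
def render(name, spec):
    """Render a nested attribute in the ( ) and { } notation.

    spec is a dict with two optional keys:
      "components":  list of (name, spec) pairs for a composite attribute
      "multivalued": True if the attribute can hold a set of values
    """
    text = name
    components = spec.get("components")
    if components:
        inner = ", ".join(render(n, s) for n, s in components)
        text = f"{name}({inner})"
    if spec.get("multivalued"):
        text = "{" + text + "}"
    return text


simple = {}  # a simple, single-valued attribute has no extra structure

address_phone = {
    "multivalued": True,  # a person can have more than one residence
    "components": [
        ("Phone", {
            "multivalued": True,  # each residence can have multiple phones
            "components": [("Area_code", simple), ("Phone_number", simple)],
        }),
        ("Address", {
            "components": [
                ("Street_address", {"components": [
                    ("Number", simple),
                    ("Street", simple),
                    ("Apartment_number", simple),
                ]}),
                ("City", simple),
                ("State", simple),
                ("Zip", simple),
            ],
        }),
    ],
}

print(render("Address_phone", address_phone))
```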
2. Entity Types, Entity Sets, Keys, and Value Sets
Entity Types and Entity Sets. A database usually contains groups of entities that are
similar. For example, a company employing hundreds of employees may want to
store similar information concerning each of the employees. These employee
entities share the same attributes, but each entity has its own value(s) for each attribute. An entity
type defines a collection (or set) of entities that have the same
attributes. Each entity type in the
database is described by its name and attributes. Figure 7.6 shows two entity types, EMPLOYEE and COMPANY, and a list of some of the attributes for each. A few individual entities of each type are also illustrated, along with the values of their attributes. The collection of all entities of a particular entity type in the database at any point in time is called an entity set; the entity set is usually referred to using the same name as the entity type. For example, EMPLOYEE refers to both a type of entity and the current set of all employee entities in the database.
An entity type is represented in ER diagrams (see Figure 7.2) as a
rectangular box enclosing the entity type name. Attribute names are enclosed in
ovals and are attached to their entity type by straight lines. Composite
attributes are attached to their component attributes by straight lines.
Multivalued attributes are displayed in double ovals. Figure 7.7(a) shows a CAR entity type in this notation.
An entity type describes the schema or intension for a set of entities that share the same structure. The collection of
entities of a particular entity type is grouped into an entity set, which is
also called the extension of the
entity type.
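The distinction between intension and extension can be sketched as a schema object that carries its current entity set. The class EntityType, its add method, and the sample attribute list and values are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    name: str
    attributes: tuple                               # the schema (intension)
    entity_set: list = field(default_factory=list)  # the current extension

    def add(self, **values):
        """Add an entity; it must supply exactly the declared attributes."""
        if set(values) != set(self.attributes):
            raise ValueError("entity must supply exactly the declared attributes")
        self.entity_set.append(values)


EMPLOYEE = EntityType("EMPLOYEE", ("Name", "Age", "Salary"))
EMPLOYEE.add(Name="John Smith", Age=55, Salary=80000)
EMPLOYEE.add(Name="Fred Brown", Age=40, Salary=30000)

print(EMPLOYEE.attributes)       # fixed by the entity type
print(len(EMPLOYEE.entity_set))  # changes as entities are added or removed
```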
Key Attributes of an Entity Type. An important constraint on the entities of an entity type is the key or uniqueness constraint on attributes. An entity type usually has one or more attributes whose values are distinct for each individual entity in the
entity set. Such an attribute is called a key
attribute, and its values can be used to identify each entity uniquely. For
example, the Name attribute is a key of the COMPANY
entity
type in Figure 7.6 because no two companies are allowed to have the same name. For the PERSON entity type, a typical key
attribute is Ssn (Social
Security number). Sometimes several attributes together form a key, meaning
that the combination of the attribute
values must be distinct for each entity. If a set of attributes possesses this
property, the proper way to represent this in the ER model that we describe
here is to define a composite attribute
and designate it as a key attribute of the entity type. Notice that such a
composite key must be minimal; that
is, all component attributes must be included in the composite attribute to
have the uniqueness property. Superfluous attributes must not be included in a
key. In ER diagrammatic notation, each key attribute has its name underlined inside the oval, as
illustrated in Figure 7.7(a).
Specifying that an attribute is a key of an entity type means that the preceding uniqueness property must hold for every entity set of the entity type. Hence, it is a constraint that prohibits any two entities from having the same value for the key attribute at the same time. It is not the property of a particular entity set; rather, it is a constraint on any entity set of the entity type at any point in time. This key constraint (and other constraints we discuss later) is derived from the constraints of the miniworld that the database represents.
Some entity types have more than one key attribute. For example, each of the Vehicle_id and Registration attributes of the entity type CAR (Figure 7.7) is a key in its own right. The Registration attribute is an example of a composite key formed from two simple component attributes, State and Number, neither of which is a key on its own. An entity type may also have no key, in which case it is called a weak entity type (see Section 7.5).
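The uniqueness property can be checked for a given entity set with a short function. The function satisfies_key and the sample CAR entities are hypothetical; the attribute names Vehicle_id, State, and Number come from the text.

```python
def satisfies_key(entity_set, key_attributes):
    """Return True if no two entities agree on all of key_attributes."""
    seen = set()
    for entity in entity_set:
        key_value = tuple(entity[a] for a in key_attributes)
        if key_value in seen:
            return False
        seen.add(key_value)
    return True


# Hypothetical CAR entities; Registration is the composite (State, Number).
cars = [
    {"Vehicle_id": "V001", "State": "TEXAS", "Number": "ABC 123"},
    {"Vehicle_id": "V002", "State": "NEW YORK", "Number": "ABC 123"},
    {"Vehicle_id": "V003", "State": "TEXAS", "Number": "TSY 650"},
]

print(satisfies_key(cars, ["Vehicle_id"]))       # a simple key
print(satisfies_key(cars, ["State", "Number"]))  # the composite key Registration
print(satisfies_key(cars, ["State"]))            # State alone is not unique
print(satisfies_key(cars, ["Number"]))           # Number alone is not unique
```

A check like this only shows that one particular entity set is consistent with a key. Declaring a key is a stronger statement: the property must hold for every entity set of the entity type at any point in time.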
In our diagrammatic notation, if two attributes are underlined separately, then each is a key on its own. Unlike the relational model (see Section 3.2.2), there is no concept of primary key in the ER model that we present here; the primary key will be chosen during mapping to a relational schema (see Chapter 9).
Value Sets (Domains) of Attributes. Each simple attribute of an entity type is associated with a value set (or domain of values), which specifies the set of values that may be assigned to that attribute for each individual
entity. In Figure 7.6, if the range of ages allowed for employees is between 16
and 70, we can specify the value set of the Age
attribute of EMPLOYEE to be
the set of integer numbers between 16 and 70. Similarly, we can specify the
value set for the Name
attribute to be the set of strings of alphabetic characters separated by blank
characters, and so on. Value sets are not displayed in ER diagrams, and are
typically specified using the basic data
types available in most programming
languages, such as integer, string, Boolean, float, enumerated type, subrange, and so on. Additional data types
to represent common database types such as date, time, and other concepts are
also employed.
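The two value sets described above can be expressed as membership tests. The function names and the regular expression are assumptions; the Age range of 16 to 70 and the description of Name as alphabetic strings separated by blanks come from the text.

```python
import re


def in_age_value_set(value) -> bool:
    """Age: the integers between 16 and 70 inclusive."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and 16 <= value <= 70


# One or more alphabetic words, separated by single blank characters.
NAME_PATTERN = re.compile(r"[A-Za-z]+( [A-Za-z]+)*")


def in_name_value_set(value) -> bool:
    """Name: strings of alphabetic characters separated by blanks."""
    return isinstance(value, str) and NAME_PATTERN.fullmatch(value) is not None


print(in_age_value_set(55), in_age_value_set(15))
print(in_name_value_set("John Smith"), in_name_value_set("J0hn Smith"))
```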
Mathematically, an attribute A of entity set E whose value set is V can be defined as a function from E to the power set P(V) of V:
A : E → P(V)
We refer to the value of attribute A for entity e as A(e). The previous definition covers both single-valued and multivalued attributes, as well as NULLs. A NULL value is represented by the empty set. For single-valued attributes, A(e) is restricted to being a singleton set for each entity e in E, whereas there is no restriction on multivalued attributes. For a composite attribute A, the value set V is the power set of the Cartesian product of P(V1), P(V2), ..., P(Vn), where V1, V2, ..., Vn are the value sets of the simple component attributes that form A:
V = P(P(V1) × P(V2) × ... × P(Vn))
The value
set provides all possible values. Usually only a small number of these values
exist in the database at a particular time. Those values represent the data
from the current state of the miniworld. They correspond to the data as it
actually exists in the miniworld.
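The functional view of an attribute, A : E → P(V), can be sketched by mapping each entity to a subset of the value set. The entity labels p1 to p3, the degree names, and the classify helper are hypothetical.

```python
def classify(value):
    """Classify A(e), given as a frozenset of elements of the value set V."""
    if len(value) == 0:
        return "NULL"             # NULL is represented by the empty set
    if len(value) == 1:
        return "single value"     # a singleton set
    return "multiple values"      # allowed only for multivalued attributes


# College_degrees as a function from person entities to subsets of V.
college_degrees = {
    "p1": frozenset(),             # no college degree
    "p2": frozenset({"BS"}),       # one degree
    "p3": frozenset({"BS", "MS"}), # two degrees
}

for entity, value in college_degrees.items():
    print(entity, classify(value))
```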
3. Initial Conceptual Design of the COMPANY Database
We can now define the entity types for the COMPANY database, based on the requirements described in Section 7.2. After
defining several entity types and their attributes here, we refine our design
in Section 7.4 after we introduce the concept of a relationship. According to
the requirements listed in Section 7.2, we can identify four entity types—one
corresponding to each of the four items in the specification (see Figure 7.8):
An entity type DEPARTMENT with attributes Name, Number, Locations, Manager, and
Manager_start_date. Locations is the
only multivalued attribute. We can specify that both Name and Number are (separate) key attributes because each was specified to be unique.
An entity type PROJECT with attributes Name, Number, Location, and Controlling_department. Both
Name and Number are (separate) key attributes.
An entity type EMPLOYEE with attributes Name, Ssn, Sex, Address, Salary, Birth_date, Department, and Supervisor. Both Name and Address may be composite attributes; however, this was not specified in the requirements. We must go back to the users to see if any of them will refer to the individual components of Name (First_name, Middle_initial, Last_name) or of Address.
An entity type DEPENDENT with attributes Employee, Dependent_name, Sex, Birth_date, and
Relationship (to the employee).
So far, we have not represented the fact that an employee can work on several projects, nor have we represented the number of hours per week an
employee works on each project. This characteristic is listed as part of the
third requirement in Section 7.2, and it can be represented by a multivalued
composite attribute of EMPLOYEE called Works_on with the simple components (Project, Hours). Alternatively, it can be represented as a multivalued composite
attribute of PROJECT called Workers with the simple components (Employee, Hours). We choose the first alternative in Figure 7.8, which shows each of
the entity types just described. The Name attribute of EMPLOYEE is shown as a composite attribute, presumably after consultation with the users.
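The preliminary design can be sketched as four record types, with Works_on as a set of (Project, Hours) pairs on EMPLOYEE. The Python types, the use of a department number or an Ssn to refer to another entity, and all sample values are assumptions; at this stage the text treats Manager, Department, Supervisor, and similar attributes as plain attributes and refines them into relationships later.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Name:                       # composite attribute of EMPLOYEE
    first_name: str
    middle_initial: str
    last_name: str


@dataclass(frozen=True)
class Department:
    name: str                     # key
    number: int                   # key
    locations: FrozenSet[str]     # multivalued
    manager: str                  # assumed: Ssn of the managing employee
    manager_start_date: date


@dataclass(frozen=True)
class Project:
    name: str                     # key
    number: int                   # key
    location: str
    controlling_department: int   # assumed: department number


@dataclass(frozen=True)
class Employee:
    name: Name
    ssn: str
    sex: str
    address: str
    salary: float
    birth_date: date
    department: int               # assumed: department number
    supervisor: Optional[str] = None                      # assumed: supervisor's Ssn
    works_on: FrozenSet[Tuple[int, float]] = frozenset()  # {(Project, Hours)}


@dataclass(frozen=True)
class Dependent:
    employee: str                 # assumed: Ssn of the related employee
    dependent_name: str
    sex: str
    birth_date: date
    relationship: str


# Hypothetical employee working on two projects.
smith = Employee(
    Name("John", "B", "Smith"), "123456789", "M", "731 Fondren, Houston, TX",
    30000.0, date(1965, 1, 9), department=5,
    works_on=frozenset({(1, 32.5), (2, 7.5)}),
)
total_hours = sum(hours for _, hours in smith.works_on)
print(total_hours)
```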