Phase 5: Physical Database Design
Physical database design is the process of choosing specific file storage structures and access paths for the database files to achieve good performance for the various database applications. Each DBMS offers a variety of options for file organizations and access paths. These usually include various types of indexing, clustering of related records on disk blocks, linking related records via pointers, and various types of hashing techniques (see Chapters 17 and 18). Once a specific DBMS is chosen, the physical database design process is restricted to choosing the most appropriate structures for the database files from among the options offered by that DBMS. In this section we give generic guidelines for physical design decisions; they hold for any type of DBMS. The following criteria are often used to guide the choice of physical database design options:
Response time. This is the elapsed time between submitting a database transaction for execution and receiving a response. A major influence on response time that is under the control of the DBMS is the database access time for data items referenced by the transaction. Response time is also influenced by factors not under DBMS control, such as system load, operating system scheduling, or communication delays.
Space utilization. This is the amount of storage space used by the database files and their access path structures on disk, including indexes and other access paths.
Transaction throughput. This is the average number of transactions that can be processed per minute; it is a critical parameter of transaction systems such as those used for airline reservations or banking. Transaction through-put must be measured under peak conditions on the system.
Typically, average and worst-case limits on the preceding parameters are specified as part of the system performance requirements. Analytical or experimental techniques, which can include prototyping and simulation, are used to estimate the average and worst-case values under different physical design decisions to deter-mine whether they meet the specified performance requirements.
Performance depends on record size and number of records in the file. Hence, we must estimate these parameters for each file. Additionally, we should estimate the update and retrieval patterns for the file cumulatively from all the transactions. Attributes used for searching for specific records should have primary access paths and secondary indexes constructed for them. Estimates of file growth, either in the record size because of new attributes or in the number of records, should also be taken into account during physical database design.
The result of the physical database design phase is an initial determination of storage structures and access paths for the database files. It is almost always necessary to modify the design on the basis of its observed performance after the database system is implemented. We include this activity of database tuning in the next phase and cover it in the context of query optimization in Chapter 20.