Quantitative Principles of Computer Design
The most important and pervasive principle of computer design is to make the common case fast. In applying this simple principle, we have to decide what the frequent case is and how much performance can be improved by making that case faster. A fundamental law, called Amdahl’s Law, can be used to quantify this principle.
Amdahl’s Law
The performance gain that can be obtained by improving some portion of a computer can be calculated using Amdahl’s Law. Amdahl’s Law states that the performance improvement to be gained from using some faster mode of execution is limited by the fraction of the time the faster mode can be used. Amdahl’s Law defines the speedup that can be gained by using a particular feature.
Speedup is the ratio

Speedup = Performance for entire task using enhancement when possible / Performance for entire task without using enhancement

Alternatively,

Speedup = Execution time for entire task without using enhancement / Execution time for entire task using enhancement when possible
Speedup tells us how much faster a task will run using the machine with the enhancement as opposed to the original machine. Amdahl’s Law gives us a quick way to find the speedup from some enhancement, which depends on two factors:
1. The fraction of the computation time in the original machine that can be converted to take advantage of the enhancement. For example, if 20 seconds of the execution time of a program that takes 60 seconds in total can use an enhancement, the fraction is 20/60. This value, which we will call Fraction_enhanced, is always less than or equal to 1.
2. The improvement gained by the enhanced execution mode, that is, how much faster the task would run if the enhanced mode were used for the entire program. This value is the time of the original mode over the time of the enhanced mode: if the enhanced mode takes 2 seconds for some portion of the program that can completely use the mode, while the original mode took 5 seconds for the same portion, the improvement is 5/2. We will call this value, which is always greater than 1, Speedup_enhanced.
The execution time using the original machine with the enhanced mode will be the time spent using the unenhanced portion of the machine plus the time spent using the enhancement:

Execution time_new = Execution time_old × ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

The overall speedup is the ratio of the execution times:

Speedup_overall = Execution time_old / Execution time_new = 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)
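As a quick check of the definitions above, here is a minimal Python sketch of Amdahl’s Law; the function name and the sample numbers (reusing the 20-of-60-second example with an assumed Speedup_enhanced of 2.5) are illustrative, not part of the original text.

def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # fraction_enhanced: fraction of the original execution time that can use
    #                    the enhancement (0 <= value <= 1)
    # speedup_enhanced:  how much faster that portion runs in the enhanced mode (> 1)
    new_time_ratio = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time_ratio

# 20 of 60 seconds can use the enhancement, and that portion runs 2.5 times faster.
print(amdahl_speedup(20 / 60, 2.5))   # prints 1.25: the whole task is only 25% faster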
Amdahl’s Law can serve as a guide to how much an enhancement will improve performance and how to distribute resources to improve cost/performance.
The CPU Performance Equation
Essentially all computers are constructed using a clock running at a constant rate. These discrete time events are called ticks, clock ticks, clock periods, clocks, cycles, or clock cycles. Computer designers refer to the time of a clock period by its duration (e.g., 1 ns) or by its rate (e.g., 1 GHz). CPU time for a program can then be expressed as

CPU time = CPU clock cycles for a program × Clock cycle time
In addition to the number of clock cycles needed to execute a program, we can also count the number of instructions executed: the instruction path length or instruction count (IC). If we know the number of clock cycles and the instruction count, we can calculate the average number of clock cycles per instruction (CPI).
CPI is computed as

CPI = CPU clock cycles for a program / Instruction count
This allows us to use CPI in the execution time formula:

CPU time = Instruction count × Cycles per instruction × Clock cycle time
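To make the relationship concrete, here is a short Python sketch; the instruction count, CPI, and clock rate below are assumed sample values, not figures from the text.

instruction_count = 2_000_000         # instructions executed by the program
cpi = 1.5                             # average clock cycles per instruction
clock_rate_hz = 1e9                   # 1 GHz clock
clock_cycle_time = 1 / clock_rate_hz  # 1 ns per cycle

cpu_time = instruction_count * cpi * clock_cycle_time
print(f"CPU time = {cpu_time * 1e3:.1f} ms")   # 2e6 * 1.5 * 1e-9 s = 3.0 ms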
Principle of Locality
Locality of reference means that programs tend to reuse data and instructions they have used recently. A widely held rule of thumb is that a program spends 90% of its execution time in only 10% of the code. An implication of locality is that we can predict with reasonable accuracy what instructions and data a program will use in the near future based on its accesses in the recent past.
Locality of reference also applies to data accesses, though not as strongly as to code accesses. Two different types of locality have been observed. Temporal locality states that recently accessed items are likely to be accessed in the near future. Spatial locality says that items whose addresses are near one another tend to be referenced close together in time.
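As an illustration (this loop is an assumed example, not taken from the text), a simple array summation exhibits both kinds of locality.

data = list(range(1_000_000))

# Temporal locality: total and i are reused on every iteration.
# Spatial locality: data[i] sweeps through the array in index order, so
# neighboring elements are referenced close together in time.
total = 0
for i in range(len(data)):
    total += data[i]
print(total)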
Advantage of Parallelism
Taking advantage of parallelism is one of the most important methods for improving performance. We give three brief examples, which are expounded on in later chapters. Our first example is the use of parallelism at the system level. To improve the throughput performance on a typical server benchmark, such as SPECWeb or TPC, multiple processors and multiple disks can be used. The workload of handling requests can then be spread among the CPUs or disks, resulting in improved throughput. This is the reason that scalability is viewed as a valuable asset for server applications.
At the level of an individual processor, taking advantage of parallelism among instructions is critical to achieving high performance. One way to do this is through pipelining. The basic idea behind pipelining is to overlap the execution of instructions, so as to reduce the total time to complete a sequence of instructions. Viewed from the perspective of the CPU performance equation, we can think of pipelining as reducing the CPI by allowing instructions that take multiple cycles to overlap.
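Viewed through the CPU performance equation, the effect can be sketched with assumed numbers; the CPI values below are illustrative, not measurements.

instruction_count = 1_000_000
clock_cycle_time = 1e-9               # 1 ns cycle

cpi_unpipelined = 4.0                 # assumed: each instruction takes 4 cycles to finish
cpi_pipelined = 1.2                   # assumed: overlapping brings the average near 1

time_unpipelined = instruction_count * cpi_unpipelined * clock_cycle_time
time_pipelined = instruction_count * cpi_pipelined * clock_cycle_time
print(time_unpipelined / time_pipelined)   # about 3.3x faster from the lower CPI alone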
A key insight that allows pipelining to work is that not every instruction depends on its immediate predecessor, and thus, executing the instructions completely or partially in parallel may be possible.
Parallelism can also be exploited at the level of detailed digital design. For example, set-associative caches use multiple banks of memory that are typically searched in parallel to find a desired item. Modern ALUs use carry-lookahead, which uses parallelism to speed the process of computing sums from linear in the number of bits in the operands to logarithmic.
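As a conceptual sketch of the cache example (the structure and numbers below are assumptions for illustration, not from the text), a set-associative lookup compares a tag against every way of the selected set; the hardware performs these comparisons in parallel, even though software can only model them with a loop.

NUM_SETS = 4
NUM_WAYS = 2

# cache[set_index] holds the tags currently stored in that set's ways.
cache = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
cache[1][0] = 0x2A                    # pretend a block with tag 0x2A sits in set 1, way 0

def lookup(set_index, tag):
    # Hardware compares the tag against all ways of the set at once;
    # this loop is only a sequential model of that parallel search.
    return any(stored == tag for stored in cache[set_index])

print(lookup(1, 0x2A))                # True  (hit)
print(lookup(1, 0x3B))                # False (miss)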