DRAM interfaces
The basic DRAM interface
The basic DRAM interface takes the processor-generated address, places
the high-order bits onto the memory address bus to form the row address and
asserts the RAS* signal. This partial address is latched internally by the
DRAM. The remaining low-order bits, forming the column address, are then driven
onto the bus and the CAS* signal is asserted. After the access time has expired,
the data appears on the Dout pin and is latched by the processor. The RAS* and
CAS* signals are then negated. This cycle is repeated for every access. The
majority of DRAM specifications define minimum pulse widths for the RAS* and
CAS* signals, and these often form the major part of the memory access time. To
remain compatible with the PC–AT standard, memory refresh is performed every 15
microseconds.
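The sequence can be pictured in C, although in a real system it is performed by the memory controller hardware. A minimal sketch, assuming a hypothetical 1 Mbit x1 device with ten row and ten column address bits; all the helper functions are invented names standing in for the controller's bus operations:

    #include <stdint.h>

    /* Hypothetical helpers standing in for the controller hardware. */
    extern void drive_addr(uint32_t half_addr);
    extern void assert_ras(void);
    extern void assert_cas(void);
    extern void negate_ras(void);
    extern void negate_cas(void);
    extern void wait_access_time(void);
    extern uint8_t latch_data(void);

    #define ROW_BITS 10                /* assumed 1 Mbit x1 geometry */
    #define COL_BITS 10

    uint8_t dram_read(uint32_t addr)
    {
        uint32_t row = (addr >> COL_BITS) & ((1u << ROW_BITS) - 1u);
        uint32_t col = addr & ((1u << COL_BITS) - 1u);
        uint8_t data;

        drive_addr(row);      /* high-order bits form the row address     */
        assert_ras();         /* row latched internally on the RAS* edge  */
        drive_addr(col);      /* low-order bits form the column address   */
        assert_cas();         /* column latched on the CAS* edge          */
        wait_access_time();   /* data valid on Dout after the access time */
        data = latch_data();
        negate_cas();         /* negate both strobes; minimum pulse-width */
        negate_ras();         /* and precharge times must still be met    */
        return data;
    }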
This direct access method limits wait state-free operation to the lower
processor speeds. DRAM with a 100 ns access time would only allow a 12.5 MHz
processor to run with zero wait states. To achieve 20 MHz operation needs 40 ns
DRAM, which is unavailable today, or fast static RAM, which comes at a price.
Fortunately, the embedded system designer has more tricks up his sleeve to
improve DRAM performance for systems with or without cache.
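The arithmetic behind these figures can be checked with a short model. The two-clock basic bus cycle and the 60 ns of controller overhead (address multiplexing, strobe delays and data setup) are illustrative assumptions chosen to match the quoted numbers, not specified figures:

    #include <math.h>
    #include <stdio.h>

    static int wait_states(double access_ns, double clock_mhz)
    {
        const double overhead_ns = 60.0;  /* assumed mux, strobe and setup time */
        double period_ns = 1000.0 / clock_mhz;
        int clocks = (int)ceil((access_ns + overhead_ns) / period_ns);
        return clocks > 2 ? clocks - 2 : 0;   /* assumed two-clock basic cycle */
    }

    int main(void)
    {
        printf("100 ns DRAM at 12.5 MHz: %d wait states\n", wait_states(100, 12.5)); /* 0 */
        printf("100 ns DRAM at 20 MHz:   %d wait states\n", wait_states(100, 20));   /* 2 */
        printf(" 40 ns DRAM at 20 MHz:   %d wait states\n", wait_states(40, 20));    /* 0 */
        return 0;
    }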
Page mode operation
One way of reducing the effective access time is to remove the need for
the RAS* pulse every time the DRAM is accessed. It must be pulsed on the
first access, but subsequent accesses to the same page (i.e. with the same row
address) do not require it and so complete faster. This is how the ‘page
mode’ versions of most 256 kb, 1 Mb and 4 Mb memories work. In page mode, the row
address is supplied as normal but the RAS* signal is left asserted. This selects
an internal page of memory within the DRAM, where any bit of data can be
accessed by placing the column address on the bus and asserting CAS*. With 256 kb
memory, this gives a page of 1 kbyte (512 column bits per DRAM row with 16
DRAMs in the array). A 2 kbyte page is available from 1 Mb DRAM and a 4 kbyte
page with 4 Mb DRAM.
This allows fast processors to work with slower memory and yet achieve
almost wait state-free operation. The first access is slower and causes wait
states but subsequent accesses within the selected page are quicker with no
wait states.
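In controller terms, page mode amounts to remembering the currently open row and comparing it against each new access. A minimal sketch for a single bank, with hypothetical helper names and the same assumed ten-bit column address as before:

    #include <stdbool.h>
    #include <stdint.h>

    extern void row_access(uint32_t row);   /* assert RAS* with a new row   */
    extern void col_access(uint32_t col);   /* pulse CAS* with a new column */
    extern void close_page(void);           /* negate RAS* and precharge    */

    #define COL_BITS 10    /* assumed geometry */

    static uint32_t open_row;
    static bool     page_open = false;

    void page_mode_read(uint32_t addr)
    {
        uint32_t row = addr >> COL_BITS;
        uint32_t col = addr & ((1u << COL_BITS) - 1u);

        if (!page_open || row != open_row) {   /* page miss: full RAS* cycle */
            if (page_open)
                close_page();
            row_access(row);                   /* slow path, causes wait states */
            open_row = row;
            page_open = true;
        }
        col_access(col);                       /* page hit: CAS-only access */
    }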
However, there is one restriction. The maximum time that the RAS* signal
can be asserted during page mode operation is often specified at about 10
microseconds. In non-PC designs, the refresh interval is frequently adjusted to
match this time, so a refresh cycle will always occur and prevent a
specification violation. With the PC standard of 15 microseconds, this is not
possible. Many chip sets neatly resolve the situation by using an internal
counter which times out page mode access after 10 microseconds.
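Such a counter can be sketched as below; microseconds_now() and close_page() are hypothetical stand-ins for what is really dedicated hardware:

    #include <stdint.h>

    #define RAS_LIMIT_US 10   /* typical maximum RAS* assertion time */

    extern uint32_t microseconds_now(void);
    extern void close_page(void);

    static uint32_t page_opened_at;

    void check_ras_timeout(void)
    {
        /* Force the page closed before the limit expires, so the
         * specification is never violated; the next access simply
         * reopens the page with a fresh RAS* cycle.
         */
        if (microseconds_now() - page_opened_at >= RAS_LIMIT_US - 1)
            close_page();
    }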
Page interleaving
Using a page mode design only provides greater performance when the
memory cycles exhibit some form of locality, i.e. stay within the page
boundary. Every access outside the boundary causes a page miss and two or three
wait states. The secret, as with caches, is to increase the hits and reduce the
misses. Fortunately, most accesses are sequential or localised, as in program
subroutines and some data structures. However, if a program is frequently
accessing data, the memory activity often follows a code–data–code–data access
pattern. If the code areas and data areas are in different pages, any benefit
that page mode could offer is lost. Each access changes the page selection,
incurring wait states. The solution is to increase the number of pages
available. If the memory is divided into several banks, each bank can offer a
selected page, increasing the number of pages and, ultimately, the number of
hits and performance. Again, extensive hardware support is needed and is
frequently provided by the PC chip set.
Page interleaving is usually implemented as a one, two or four way
system, depending on how much memory is installed. With a four way system,
there are four memory banks, each with its own RAS* and CAS* lines. With 4
Mbyte DRAM banks, this would offer 16 Mbytes of system RAM. The four way system
allows four pages to be selected within page mode at any one time. Page 0 is in
bank 1, page 1 in bank 2, and so on, with the sequence restarting after four
banks.
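The bank selection falls straight out of the address bits above the page offset. A sketch assuming 4 kbyte pages (as with 4 Mb DRAM) and four banks; the bit positions are illustrative:

    #include <stdint.h>

    #define PAGE_SHIFT 12    /* 4 kbyte page, as with 4 Mb DRAM */
    #define NUM_BANKS  4     /* four way interleave             */

    static uint32_t open_row[NUM_BANKS];
    static int      page_open[NUM_BANKS];

    /* Consecutive pages map to consecutive banks (page n goes to bank
     * n mod 4; the text numbers these bank 1 to bank 4), so a code
     * page and a data page can both stay open at the same time.
     */
    static unsigned bank_of(uint32_t addr)
    {
        return (addr >> PAGE_SHIFT) & (NUM_BANKS - 1u);
    }

    int is_page_hit(uint32_t addr)
    {
        unsigned b   = bank_of(addr);
        uint32_t row = addr >> (PAGE_SHIFT + 2);   /* bits above bank select */
        return page_open[b] && open_row[b] == row;
    }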
With interleaving and fast page mode devices, inexpensive 85 ns DRAM can
be used with a 16 MHz processor to achieve a 0.4 wait state system. Without
page interleaving, this system would insert two wait states on every
access. With the promise of faster DRAM, future systems will be able to offer
33–50 MHz operation with very good performance, without the need for cache memory
and its associated costs and complexity.
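The 0.4 figure is simply the page miss rate multiplied by the miss penalty: with the two wait state penalty quoted above, it corresponds to an 80% page hit rate, an assumed figure chosen to reproduce the result:

    /* Average wait states = miss rate x miss penalty per miss.
     * The 80% hit rate is an assumption that reproduces the quoted 0.4.
     */
    static double avg_wait_states(double hit_rate, int miss_penalty)
    {
        return (1.0 - hit_rate) * (double)miss_penalty;
    }
    /* avg_wait_states(0.80, 2) == 0.4 */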
Burst mode operation
Some versions of the DRAM chip, such as page mode, static column or
nibble mode devices, do not need to have the RAS*/CAS* cycle repeated and can
provide data much faster if only the new column address is given. This has
allowed the use of a burst fill memory interface, where the processor fetches
more data than it needs and keeps the extra data in an internal cache ready for
future use. The main advantage of this system is in reducing the need for fast
static RAMs to realise the processor’s performance. With 60 ns page mode DRAM,
a 4-1-1-1 (four clocks for the first access, a single cycle for each remaining
access in the burst) memory system can easily be built. Each 128 bits of data
fetched in this way takes only seven clock cycles, compared with five in the
fastest possible system. If bursting was not supported, the same access would
take 16 clocks. This translates to a very effective price/performance ratio: a
4-1-1-1 DRAM system gives about 90% of the performance of a more expensive
2-1-1-1 static RAM design. This interface is used on the higher performance
processors, where it is used in conjunction with on-chip caches. The burst fill
is used to load a complete line of data within the cache.
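The clock counts quoted can be verified directly: four 32-bit transfers make up the 128 bits, so a 4-1-1-1 burst takes 4 + 1 + 1 + 1 = 7 clocks, a 2-1-1-1 static RAM design takes 5, and four separate non-burst accesses at four clocks each take 16. A short check:

    #include <stdio.h>

    /* Clocks to fetch a four-beat burst (128 bits on a 32-bit bus),
     * given the lead-off count and the per-beat count.
     */
    static int burst_clocks(int first, int subsequent, int beats)
    {
        return first + subsequent * (beats - 1);
    }

    int main(void)
    {
        printf("4-1-1-1 page mode DRAM:  %d clocks\n", burst_clocks(4, 1, 4)); /*  7 */
        printf("2-1-1-1 static RAM:      %d clocks\n", burst_clocks(2, 1, 4)); /*  5 */
        printf("no burst, 4 clocks each: %d clocks\n", burst_clocks(4, 4, 4)); /* 16 */
        return 0;
    }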
EDO memory
EDO stands for extended data out memory and is a form of fast page mode
RAM with a quicker cycling process and thus faster page mode access. This
removes wait states and thus improves the overall performance of the system.
The improvement is achieved by fine tuning the CAS* operation.
With fast page mode, while the RAS* signal is still asserted, each time
the CAS* signal goes high the data outputs stop driving the data bus and go
into a high impedance state. This transition is often used to simplify the
design by meeting the bus timing requirements, and it is common with this type
of design to permanently ground the output enable pin. The problem is that this
requires the CAS* signal to be asserted until the data from the DRAM is latched
by the processor or bus master. This means that the next access cannot be
started until this has been completed, causing delays.
EDO memory does not cause the outputs to go to high impedance; it
will continue to drive data even after the CAS* signal is negated. By doing this,
the CAS* precharge can be started for the next access while the data from the
previous access is still being latched. This saves valuable nanoseconds and can
mean the removal of a wait state. With very high performance processors this is a
big advantage, and EDO-type DRAM is becoming the de facto standard for PCs,
workstations and any other application that needs high performance memory.
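The gain can be expressed in the same burst notation as before. Assuming, purely for illustration, that fast page mode needs two clocks per in-page access because CAS* must stay asserted until the data is latched, while EDO's overlapped precharge brings this down to one:

    /* Illustrative cycle counts only; the 4-2-2-2 and 4-1-1-1 patterns
     * are assumptions, not quoted figures.
     */
    static int burst_clocks(int first, int subsequent, int beats)
    {
        return first + subsequent * (beats - 1);
    }
    /* fast page mode 4-2-2-2: burst_clocks(4, 2, 4) == 10 clocks
     * EDO            4-1-1-1: burst_clocks(4, 1, 4) ==  7 clocks
     */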