With the faster processors available today, the wait states incurred in external memory accesses start to reduce performance dramatically. To recover this lost performance, many designs implement a cache memory to buffer the processor from such delays. Once predominantly found in high end systems, caches now appear both on chip and as external designs.
Cache memory systems work because of the cyclical structures within software. Most software structures are loops where pieces of code are repeatedly executed, albeit with different data. Cache memory systems store these loops so that, after the loop has been fetched from main memory, it can be obtained from the cache for subsequent executions. Accesses from the cache are faster than from main memory and thus increase the system's throughput.
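This behaviour can be seen with a simple sketch. The C fragment below is purely illustrative: it assumes a hypothetical direct mapped cache of 64 lines of 16 bytes and a 256 byte loop body executed 1000 times, figures chosen for the example rather than taken from any particular processor. It replays the loop's instruction fetch addresses through the cache and counts hits and misses; only the first pass has to fill lines from main memory, and every later fetch hits the cache.

#include <stdio.h>

#define LINE_SIZE 16
#define NUM_LINES 64

int main(void)
{
    unsigned long tags[NUM_LINES] = {0};
    int valid[NUM_LINES] = {0};
    unsigned long hits = 0, misses = 0;

    /* Replay the instruction fetches of a 256 byte loop body, run 1000 times. */
    for (int iter = 0; iter < 1000; iter++) {
        for (unsigned long addr = 0x1000; addr < 0x1100; addr += 4) {
            unsigned long block = addr / LINE_SIZE;   /* which memory line   */
            unsigned long index = block % NUM_LINES;  /* direct mapped entry */
            if (valid[index] && tags[index] == block) {
                hits++;                               /* served from cache           */
            } else {
                misses++;                             /* line filled from main memory */
                valid[index] = 1;
                tags[index] = block;
            }
        }
    }
    printf("hits %lu, misses %lu\n", hits, misses);
    return 0;
}

Running the sketch shows 16 misses (the first pass through the loop) against roughly 64,000 hits, which is why looping code benefits so strongly from a cache.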
There are several criteria associated with cache design which affect its performance. The most obvious is cache size: the larger the cache, the more entries are stored and the higher the hit rate. For the 80x86 processor architecture, the best price/performance is obtained with a 64 kbyte cache. Beyond this size, the cost of getting extra performance is extremely high.
The set associativity is another criterion. It describes the number of cache entries that could possibly contain the required data. With a direct mapped cache, there is only a single possibility; with a two way system, there are two possibilities, and so on. Direct mapped caches can get involved in a bus thrashing situation, where two frequently used memory locations are separated by a multiple of the cache size and therefore map to the same cache entry. Here, every time word A is accessed, word B is discarded from the cache; every time word B is accessed, word A is lost, and so on. The cache starts thrashing and overall performance is degraded. With a two way design, there are two possible entries for each location, so both words can be held at once and this thrashing is prevented.

The cache line refers to the number of consecutive bytes that are associated with each cache entry. Due to the sequential nature of instruction flow, if a cache hit occurs at the beginning of a line, it is highly probable that the rest of the line will be accessed as well. It is therefore prudent to burst fill cache lines whenever a miss forces a main memory access.

The effects of set associativity and line length are not as clear-cut as those of cache size, and it is difficult to say what the best values are for a particular system. Cache performance is extremely system and software dependent and, in practice, overall system performance increases of 20–30% are typical.
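The thrashing case follows directly from the way a direct mapped cache chooses its entry: the index is derived from the address modulo the number of lines. The sketch below, again purely illustrative, assumes a hypothetical 64 kbyte direct mapped cache with 16 byte lines and two example addresses separated by exactly the cache size; both yield the same index, so alternating accesses evict each other, whereas a two way design provides two entries per index and could hold both words simultaneously.

#include <stdio.h>

#define CACHE_SIZE (64 * 1024)
#define LINE_SIZE  16
#define NUM_LINES  (CACHE_SIZE / LINE_SIZE)       /* 4096 entries */

static unsigned long cache_index(unsigned long addr)
{
    return (addr / LINE_SIZE) % NUM_LINES;        /* direct mapped placement rule */
}

int main(void)
{
    unsigned long word_a = 0x20000;               /* illustrative address of word A */
    unsigned long word_b = word_a + CACHE_SIZE;   /* word B, one cache size away    */

    printf("index of A = %lu, index of B = %lu\n",
           cache_index(word_a), cache_index(word_b));
    /* Both print the same index: in a direct mapped cache A and B share one
       entry and evict each other, while a two way cache holds both at once. */
    return 0;
}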