Optimising line length and cache size
This performance degradation is symptomatic of external bus thrashing, caused by a cache line length and/or burst fill length that is wrong for the software being run. It is therefore important to get these values correct. If the burst fill length is greater than the number of sequential instructions executed before a flow change, data is fetched which will not be used. This consumes valuable external bus bandwidth. If the burst length is greater than the line length, multiple cache lines have to be updated, which might destroy a cache entry for another piece of code that will be executed later. This reduces the efficiency of the cache mechanism and increases the amount of cache flushing, again consuming external bus bandwidth. Both of these contribute to the notorious ‘bus thrashing’ syndrome where the processor spends vast amounts of time fetching data that it never uses. Some cache schemes allow line lengths of 1, 4, 8, 16 or 32 to be selected; however, most systems use a line and burst fill length of 4. Where there are large blocks of data to be moved, higher values can improve performance within these moves, but this must be offset against any effect on other activities.
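The cost of an over-long burst fill can be put into rough numbers. The sketch below is an illustration only: the function name `burst_waste` is hypothetical, lengths are assumed to be counted in words, the sequential run is assumed to start on a burst boundary, and every burst is assumed to run to completion once started.

```python
def burst_waste(run_length, burst_length):
    """Return (words_fetched, words_wasted) for one sequential run.

    run_length:   words executed sequentially before a flow change
    burst_length: words fetched by each burst fill
    Assumes the run starts on a burst boundary and that a burst,
    once started, always completes.
    """
    bursts = -(-run_length // burst_length)  # ceiling division
    fetched = bursts * burst_length
    return fetched, fetched - run_length


# Six sequential words before a branch, for each selectable length.
for burst in (1, 4, 8, 16, 32):
    fetched, wasted = burst_waste(6, burst)
    print(f"burst={burst:2d} fetched={fetched:2d} wasted={wasted:2d}")
```

Under these assumptions, a run of six words wastes two fetched words with a burst length of 4 or 8, but 26 with a burst length of 32, which is bus bandwidth spent on data that is never used.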
Cache size is another variable which can affect performance. Unfortunately, it always seems to be the case that the ideal cache is twice the size of that currently available! The biggest difficulty is that cache size and efficiency are totally software dependent: a configuration that works for one application is not necessarily the optimum for another.
The table shows some efficiency figures quoted by Intel in their 80386 Hardware Reference Manual. From this data, it is apparent that there is no clear-cut advantage of one configuration over another. It is very easy to get into religious wars over cache organisation, where one faction will argue that their particular organisation is right and that everything else is wrong. In practice, it is incredibly difficult to make such claims without measuring and benchmarking a real system. In addition, the advantages can be small compared to other performance techniques such as software optimisation. In the end, ‘the bigger the cache the better, irrespective of whether it is set-associative or not’ is probably the best maxim to remember.
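The point that cache organisations can only be compared by measurement can be tried out with a toy simulator. The following is a minimal sketch, not a model of any real cache: the function `hit_rate`, the LRU replacement policy, word-addressed sizes and the synthetic trace (a 32-word loop that calls a 16-word routine located 256 words away) are all assumptions made for illustration.

```python
from collections import OrderedDict


def hit_rate(trace, cache_words, line_words, ways):
    """Fraction of word accesses in `trace` that hit an LRU cache."""
    num_sets = (cache_words // line_words) // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in trace:
        line = addr // line_words          # which memory line holds addr
        entries = sets[line % num_sets]    # the set that line maps to
        if line in entries:
            hits += 1
            entries.move_to_end(line)      # mark as most recently used
        else:
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict least recently used
            entries[line] = True
    return hits / len(trace)


# Hypothetical trace: loop body at words 0..31, routine at words 256..271.
trace = (list(range(0, 32)) + list(range(256, 272))) * 100

for size, ways in ((256, 1), (256, 2), (512, 1)):
    rate = hit_rate(trace, size, 4, ways)
    print(f"{size} words, {ways}-way: hit rate {rate:.4f}")
```

With this particular trace, the 256-word direct-mapped cache loses hits because the loop and the routine compete for the same lines, and either making it two-way or doubling its size removes the conflict equally well. A different trace can easily change the ranking, which is why figures of this kind mean little until they are measured on the real application.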