Kristof Beyls and Erik H. D'Hollander and Yijun Yu.
"Visualization Enables the Programmer to Reduce Cache Misses".
In IASTED Conference on Parallel and Distributed Computing and Systems, pp. 781--786, 2002.


Links:

Abstract:

Many programs execution speed suffer from cache misses. These can be reduced on three different levels: the hardware level, the compiler level and the algorithm level. Much work has been done on the hardware level and the compiler level, however relatively little work has been done on assisting the programmer to increase the locality in his programs. In this paper, a method is proposed to visualize the locality which is not exploited by the cache hardware, based on the reuse distance metric. Visualizing the reuse distances allows the programmer to see the cache bottlenecks in its program at a single glance, which allows him to think about alternative ways to perform the same computation with increased cache efficiency. Furthermore, since the reuse distance is independent of cache size and associativity, the programmer will focus on optimizations which increase cache effectiveness for a wide range of caches. As a case study, the cache behavior of the MCF program, which has the worst cache behavior in the SPEC2000 benchmarks, is visualized. A simple optimization, based on the visualization, leads to consistent speedups from 24% to 48% on different processors and cache architectures, such as PentiumII, Itanium and Alpha.

Summary:

This paper presents a method for visualizing the locality of cache accesses. The two main types of cache misses are conflict and capacity which occur when the associativiey and size of the cache are too small, respectively. Capacity misses are the most dominate types of misses, however optimizations at the hardware and compiler levels mostly target conflict misses. The goal of the visualizations in this paper is to show the locality of the cache by displaying the distance reuse metric which measures the number of unique memory locations that are accessed between multiple references of a target location. This technique displays the metric within the source code, thereby allowing the programmer to see the exact location of the bottlenecks.

Bibtex:

@InProceedings{ beyls:2002:VEPR,
  author = 	"Kristof Beyls and Erik H. D'Hollander and Yijun Yu",
  title = 	"Visualization Enables the Programmer to Reduce Cache
                  Misses",
  booktitle = 	"IASTED Conference on Parallel and Distributed
                  Computing and Systems",
  pages = 	"781--786",
  month = 	"Nov",
  year = 	"2002",
}