 | The ACM Journal of Experimental Algorithmics |
Volume 4, Article 3, 1999
Matrix Multiplication: A Case Study of Enhanced Data Cache Utilization
by
and
and
http://www.jea.acm.org/1999/EironMatrix/
Abstract:
Modern machines present two challenges to algorithm
engineers and compiler writers: They have superscalar,
super-pipelined structure, and they have elaborate memory
subsystems specifically designed to reduce latency and increase
bandwidth. Matrix multiplication is a classical benchmark for
experimenting with techniques used to exploit machine architecture and
to overcome the limitations of contemporary memory subsystems.
This research aims at advancing the state of the art of algorithm
engineering by balancing instruction level parallelism, two levels of
data tiling, copying to provably avoid any cache conflicts, and
prefetching in parallel to computational operations, in order to fully
exploit the memory bandwidth. Measurements on IBM's RS/6000 43P
workstation show that the resultant matrix multiplication algorithm
outperforms IBM's ESSL by 6.8-31.8%, is less sensitive to the size of
the input data, and scales better.
In this paper we introduce a cache aware algorithm for matrix
multiplication. We also suggest generic
guidelines that may be applied to compute intensive algorithm to
efficiently utilize the data cache. We believe that some of
our concepts may be embodied in compilers.
- Keywords: Cache, prefetching, blocking,
matrix multiplication, BLAS
- CR Categories: G.4 [Mathematical Software]: Algorithm design and analysis
- Sources:
- The LaTeX version of the article;
this is a Unix file that contains the LaTeX source;
note that you must be a subscriber (institutional or individual) to access this file.
- The Postscript
version or PDF
version of the article;
note that you must be a subscriber (institutional or individual) to access
these files.
- The HTML version of the article;
note that you must be a subscriber (institutional or individual) to access
this file.
- The bibliography given in the article.
Received
|
Accepted
|
Final Revision
|
Published
|
April 1, 1999
|
January 20, 2000
|
January 24, 2000
|
May 10, 2000
|
Last updated and validated May 10, 2000, by editor@jea.acm.org