Cache-oblivious algorithms

  • M. Frigo, C. Leiserson, H. Prokop, S. Ramachandran
  • Published 1999
  • Computer Science
  • 40th Annual Symposium on Foundations of Computer Science (Cat. No.99CB37039)
  • This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: no variables dependent on hardware parameters, such as cache size and cache-line length, need to be tuned to achieve optimality. Nevertheless, these algorithms use an optimal amount of work and move data optimally among multiple levels of cache. For a cache with size Z and…
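
  The idea of avoiding hardware-dependent tuning parameters can be illustrated with a recursive matrix transpose. The following is a minimal Python sketch of the general divide-and-conquer approach, not the paper's exact procedure: it halves the larger dimension until the block is small, so at some recursion depth the blocks fit in whatever cache exists without the code knowing its size. The names `transpose_recursive`, `transpose`, and the `base` cutoff are assumptions made for illustration; `base` only limits Python call overhead and is not derived from any cache parameter.

```python
def transpose_recursive(a, b, r0, r1, c0, c1, base=4):
    """Write the transpose of the block a[r0:r1][c0:c1] into b (b[j][i] = a[i][j])."""
    rows, cols = r1 - r0, c1 - c0
    if rows <= base and cols <= base:
        # Small block: copy element by element.
        for i in range(r0, r1):
            for j in range(c0, c1):
                b[j][i] = a[i][j]
    elif rows >= cols:
        # Split the row range in half and recurse on each part.
        mid = r0 + rows // 2
        transpose_recursive(a, b, r0, mid, c0, c1, base)
        transpose_recursive(a, b, mid, r1, c0, c1, base)
    else:
        # Split the column range in half and recurse on each part.
        mid = c0 + cols // 2
        transpose_recursive(a, b, r0, r1, c0, mid, base)
        transpose_recursive(a, b, r0, r1, mid, c1, base)


def transpose(a):
    """Return the transpose of a non-empty rectangular list-of-lists matrix."""
    m, n = len(a), len(a[0])
    b = [[None] * m for _ in range(n)]
    transpose_recursive(a, b, 0, m, 0, n)
    return b
```

  For example, `transpose([[1, 2, 3], [4, 5, 6]])` returns `[[1, 4], [2, 5], [3, 6]]`. Setting `base=1` gives the pure recursive form with no cutoff at all.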
    352 Citations

    Cache-Oblivious Algorithms
    • 26
    Cache-efficient matrix transposition
    • S. Chatterjee, Sandeep Sen
    • Computer Science
    • Proceedings Sixth International Symposium on High-Performance Computer Architecture. HPCA-6 (Cat. No.PR00550)
    • 2000
    • 68
    • Highly Influenced
    An Experimental Comparison of Cache-oblivious and Cache-aware Programs (DRAFT: DO NOT DISTRIBUTE)
    • 2
    • Highly Influenced
    Optimizing Graph Algorithms for Improved Cache Performance
    • 75
    Resource Oblivious Sorting on Multicores
    • 26
    An experimental comparison of cache-oblivious and cache-conscious programs
    • 89
    • Highly Influenced
    The cache complexity of multithreaded cache oblivious algorithms
    • 23


    Towards a Theory of Cache-Efficient Algorithms (Extended Abstract)
    • 7
    Towards an optimal bit-reversal permutation program
    • L. Carter, Kang Su Gatlin
    • Computer Science
    • Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280)
    • 1998
    • 28
    The input/output complexity of sorting and related problems
    • 1,294
    • Highly Influential
    The cache performance and optimizations of blocked algorithms
    • 735
    Hierarchical memory with block transfer
    • 208
    Recursive array layouts and fast parallel matrix multiplication
    • 123
    FFTs in external or hierarchical memory
    • D. Bailey
    • Computer Science
    • Proceedings of the 1989 ACM/IEEE Conference on Supercomputing (Supercomputing '89)
    • 1989
    • 322
    Deterministic distribution sort in shared and distributed memory multiprocessors
    • 116
    The influence of caches on the performance of sorting
    • 231