Thomas Roeder

Learn More
Cache-oblivious algorithms have been advanced as a way of circumventing some of the difficulties of optimizing applications to take advantage of the memory hierarchy of modern microprocessors. These algorithms are based on the divide-and-conquer paradigm -- each division step creates sub-problems of smaller size, and when the working set of a sub-problem(More)
As processor complexity increases compilers tend to deliver subopti-mal performance. Library generators such as ATLAS, FFTW and SPIRAL overcome this issue by empirically searching in the space of possible program versions for the one that performs the best. Empirical search can also be applied by programmers, but because they lack a tool to automate the(More)
  • 1