Combining analytical and empirical approaches in tuning matrix transposition

Q. Lu, S. Krishnamoorthy, P. Sadayappan
2006 International Conference on Parallel Architectures and Compilation Techniques (PACT)
Matrix transposition is an important kernel used in many applications. Even though its optimization has been the subject of many studies, an optimization procedure that targets the characteristics of current processor architectures has not been developed. In this paper, we develop an integrated optimization framework that addresses a number of issues, including tiling for the memory hierarchy, effective handling of memory misalignment, utilizing memory subsystem characteristics, and the…
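To make "tiling for the memory hierarchy" concrete, here is a minimal sketch of a tiled (blocked) out-of-place transpose. It is not the framework developed in the paper. The function name `transpose_tiled`, the flat row-major layout, and the default tile size of 32 are all assumptions chosen for illustration. Plain Python shows only the loop structure; the cache benefit of tiling appears in compiled code operating on contiguous arrays.

```python
def transpose_tiled(a, n_rows, n_cols, tile=32):
    """Transpose a row-major n_rows x n_cols matrix stored in a flat list.

    The matrix is visited one tile x tile block at a time, so the source
    rows and destination rows touched inside a block stay close together.
    `tile` is a hypothetical tuning parameter, not a value from the paper.
    """
    out = [0] * (n_rows * n_cols)
    # Outer loops walk over the top-left corner of each block.
    for i0 in range(0, n_rows, tile):
        for j0 in range(0, n_cols, tile):
            # Inner loops copy one block; min() handles ragged edge blocks.
            for i in range(i0, min(i0 + tile, n_rows)):
                for j in range(j0, min(j0 + tile, n_cols)):
                    # Element (i, j) of the input becomes (j, i) of the output.
                    out[j * n_rows + i] = a[i * n_cols + j]
    return out
```

For example, `transpose_tiled(list(range(6)), 2, 3, tile=2)` transposes a 2 x 3 matrix into a 3 x 2 one. In a tuned implementation the tile size would typically be selected per machine, which is where empirical search can complement an analytical model.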