Automated Operation Minimization of Tensor Contraction Expressions in Electronic Structure Calculations

@inproceedings{Hartono2005AutomatedOM,
  title={Automated Operation Minimization of Tensor Contraction Expressions in Electronic Structure Calculations},
  author={Albert Hartono and Alexander Sibiryakov and Marcel Nooijen and Gerald Baumgartner and David E. Bernholdt and So Hirata and Chi-Chung Lam and Russell M. Pitzer and J. Ramanujam and P. Sadayappan},
  booktitle={International Conference on Computational Science},
  year={2005}
}

Complex tensor contraction expressions arise in accurate electronic structure models in quantum chemistry, such as the Coupled Cluster method. Transformations using algebraic properties of commutativity and associativity can be used to significantly decrease the number of arithmetic operations required for evaluation of these expressions, but the optimization problem is NP-hard. Operation minimization is an important optimization step for the Tensor Contraction Engine, a tool being developed… 
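
To make the claim about commutativity and associativity concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes a simple cost model (one multiply and one add per iteration over the union of the two operands' indices) and uses a brute-force search over pairwise contraction orders, which is only feasible for a handful of tensors. The function names, the four-tensor expression, and the index extent N = 4 are illustrative assumptions.

```python
from itertools import combinations
from math import prod


def naive_cost(tensors, extent):
    """Operations for a single loop nest over every index.

    Assumed model: each iteration does n-1 multiplies and 1 add for n tensors.
    """
    all_idx = set().union(*(idx for _, idx in tensors))
    return len(tensors) * prod(extent[i] for i in all_idx)


def best_pairwise_order(tensors, output, extent):
    """Exhaustively search pairwise contraction orders (hypothetical helper).

    tensors: list of (name, frozenset of index names)
    output:  iterable of index names that survive in the final result
    Returns (minimum operation count, parenthesized expression string).
    """
    def search(items):
        if len(items) == 1:
            return 0, items[0][0]
        best = None
        for i, j in combinations(range(len(items)), 2):
            (name_a, idx_a), (name_b, idx_b) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            # An index must be kept if the output or a remaining tensor uses it;
            # every other index is summed away in this step.
            needed = set(output).union(*(idx for _, idx in rest))
            union = idx_a | idx_b
            step = 2 * prod(extent[x] for x in union)  # multiply + add
            merged = (f"({name_a}*{name_b})", frozenset(union & needed))
            sub_cost, expr = search(tuple(rest) + (merged,))
            if best is None or step + sub_cost < best[0]:
                best = (step + sub_cost, expr)
        return best

    return search(tuple(tensors))


# Illustrative expression: S[a,b,i,j] = sum over c,d,e,f,k,l of
# A[a,c,i,k] * B[b,e,f,l] * C[d,f,j,k] * D[c,d,e,l]
N = 4  # assumed extent for every index
extent = {i: N for i in "abcdefijkl"}
tensors = [
    ("A", frozenset("acik")),
    ("B", frozenset("befl")),
    ("C", frozenset("dfjk")),
    ("D", frozenset("cdel")),
]
cost, expr = best_pairwise_order(tensors, "abij", extent)
print("single loop nest:", naive_cost(tensors, extent))
print("best pairwise order:", cost, expr)
```

Under this model the single loop nest costs 4N^10 operations, while the best pairwise order costs 6N^6, which shows how much reordering alone can save. The exhaustive search grows factorially with the number of tensors, consistent with the abstract's remark that the general problem is NP-hard.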

Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations

An effective algorithm for common subexpression identification is developed and its effectiveness on tensor contraction expressions for coupled cluster equations is demonstrated.

Performance optimization of tensor contraction expressions for many-body methods in quantum chemistry.

An effective algorithm for operation minimization with common subexpression identification is described, its effectiveness on tensor contraction expressions for coupled cluster equations is demonstrated, and a library for efficient index permutation of multidimensional tensors is described.

Framework for Distributed Contractions of Tensors with Symmetry

A novel approach is introduced that avoids data redistribution when contracting symmetric tensors, while also avoiding redundant storage and maintaining load balance; results are presented on two parallel supercomputers.

CAST: Contraction Algorithm for Symmetric Tensors

A novel approach that avoids data redistribution during contraction of symmetric tensors while also bypassing redundant storage and maintaining load balance is introduced, and a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions is presented.

Faster identification of optimal contraction sequences for arbitrary tensor networks

This talk presents a novel approach to the search for an optimal gate application sequence which performs several orders of magnitude faster than existing search algorithms, while still guaranteeing identification of an optimal evaluation sequence for a given quantum circuit.

Integrated compiler optimizations for tensor contractions

This dissertation addresses several performance optimization issues in the context of the Tensor Contraction Engine (TCE), a domain-specific compiler to synthesize parallel, out-of-core programs for

A Communication-Optimal Framework for Contracting Distributed Tensors

A framework with three fundamental communication operators is developed to generate communication-efficient contraction algorithms for arbitrary tensor contractions, and it is shown that, for a given amount of memory per processor, the framework is communication optimal for all tensor contractions.

AutoHOOT: Automatic High-Order Optimization for Tensors

This work introduces AutoHOOT, the first automatic differentiation framework targeting high-order optimization for tensor computations, which contains a new explicit Jacobian / Hessian expression generation kernel whose outputs maintain the input tensors' granularity and are easy to optimize.

References

A direct product decomposition approach for symmetry exploitation in many-body methods. I. Energy calculations

The savings of the fully vectorizable direct product decomposition (DPD) method outlined here are associated with individual (linear) contractions, so the method is applicable to both linear and nonlinear coupled‐cluster models, as well as many-body perturbation theory.

Computer Algebra: Symbolic and Algebraic Computation

This volume is the first systematic and complete treatment of computer algebra. It presents the basic problems of computer algebra and the best algorithms now known for their solution, with their mathematical foundations and complete references to the original literature.

A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry

This paper discusses an approach to the synthesis of high-performance parallel programs for a class of computations encountered in quantum chemistry and physics. These computations are expressible as

Performance optimization of a class of loops implementing multidimensional integrals

This thesis addresses the performance optimization of a class of loops that implement multi-dimensional summations and enhances the solutions to the various optimization problems to address the practically significant issues of sparsity, use of fast Fourier transforms, and utilization of common sub-expressions.

On Optimizing a Class of Multi-Dimensional Loops with Reductions for Parallel Execution

This paper addresses the compile-time optimization of a form of nested-loop computation that is motivated by a computational physics application, and develops a pruning search strategy for determining an optimal form.

An efficient reformulation of the closed‐shell coupled cluster single and double excitation (CCSD) equations

The closed‐shell CCSD equations are reformulated in order to achieve superior computational efficiency. Using a spin adaptation scheme based on the unitary group approach (UGA), we have obtained a

Solving bigger problems-by decreasing the operation count and increasing the computation bandwidth

The purpose is to illustrate the computational complexity of modeling large (in wavelengths) electromagnetic problems and to suggest some ways by which the computational requirements can be reduced.

Arithmetic complexity of computations

Three examples of polynomials modulo a polynomial Cyclic convolution and discrete Fourier transform are shown.

Crafting a Compiler

Crafting a Compiler presents a practical approach to compiler construction with thorough coverage of the material and examples that clearly illustrate the concepts in the book.

Factor graphs and the Sum-Product Algorithm