Phaseless Auxiliary-Field Quantum Monte Carlo on Graphical Processing Units.

  title={Phaseless Auxiliary-Field Quantum Monte Carlo on Graphical Processing Units.},
  author={James Shee and Evan J. Arthur and Shiwei Zhang and David R. Reichman and Richard A. Friesner},
  journal={Journal of Chemical Theory and Computation},
  volume={14},
  number={8},
We present an implementation of phaseless Auxiliary-Field Quantum Monte Carlo (ph-AFQMC) utilizing graphical processing units (GPUs). The AFQMC method is recast in terms of matrix operations which are spread across thousands of processing cores and are executed in batches using custom Compute Unified Device Architecture kernels and the GPU-optimized cuBLAS matrix library. Algorithmic advances include a batched Sherman-Morrison-Woodbury algorithm to quickly update matrix determinants and… 
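As a rough illustration of the batched Sherman-Morrison-Woodbury idea mentioned in the abstract, the sketch below updates a stack of matrix inverses and determinants after a low-rank change A -> A + U V^T, one matrix per walker. It is a NumPy stand-in, not the paper's CUDA/cuBLAS kernels; the function name `smw_update`, the array shapes, and the random test data are assumptions made purely for illustration.

```python
import numpy as np

def smw_update(Ainv, det_A, U, V):
    """Hypothetical helper: batched rank-k update of inverse and determinant.

    Ainv:  (B, n, n) inverses of the current matrices A
    det_A: (B,)      determinants of A
    U, V:  (B, n, k) factors of the low-rank change, A_new = A + U V^T
    """
    k = U.shape[-1]
    Vt = np.swapaxes(V, -1, -2)               # (B, k, n)
    Ainv_U = Ainv @ U                         # (B, n, k)
    cap = np.eye(k) + Vt @ Ainv_U             # (B, k, k) capacitance matrices
    # Matrix determinant lemma: det(A + U V^T) = det(A) * det(I + V^T A^-1 U)
    det_new = det_A * np.linalg.det(cap)
    # Woodbury identity: only k x k systems are solved, not n x n ones
    Ainv_new = Ainv - Ainv_U @ np.linalg.solve(cap, Vt @ Ainv)
    return Ainv_new, det_new

# Illustrative data: B walkers, n x n matrices, rank-k updates (all assumed).
rng = np.random.default_rng(0)
B, n, k = 4, 6, 2
A = rng.standard_normal((B, n, n)) + n * np.eye(n)   # shifted to be well conditioned
U = 0.5 * rng.standard_normal((B, n, k))
V = 0.5 * rng.standard_normal((B, n, k))

Ainv_new, det_new = smw_update(np.linalg.inv(A), np.linalg.det(A), U, V)

# Reference values computed from scratch for comparison.
A_new = A + U @ np.swapaxes(V, -1, -2)
det_direct = np.linalg.det(A_new)
inv_direct = np.linalg.inv(A_new)
```

The update costs O(n^2 k) per matrix instead of the O(n^3) of a fresh factorization, which is why processing many walkers in one batched call is attractive on a GPU.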
Accelerating Auxiliary-Field Quantum Monte Carlo Simulations of Solids with Graphical Processing Units.
This work outlines how auxiliary-field quantum Monte Carlo (AFQMC) can leverage graphical processing units (GPUs) to accelerate the simulation of solid-state systems and demonstrates the ability of AFQMC to systematically converge solid-state calculations with respect to basis set and system size.
A Localized-Orbital Energy Evaluation for Auxiliary-Field Quantum Monte Carlo.
Phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) has recently emerged as a promising method for the production of benchmark-level simulations of medium- to large-sized molecules because of
Efficient Ab Initio Auxiliary-Field Quantum Monte Carlo Calculations in Gaussian Bases via Low-Rank Tensor Decomposition.
While the cost of conventional AFQMC calculations in Gaussian bases scales as O(N^4), it is shown that ground-state energies can be computed through tensor decomposition with reduced memory requirements and subquartic scaling.
Taming the Sign Problem in Auxiliary-Field Quantum Monte Carlo Using Accurate Wave Functions.
This work adapts a recently proposed fast multi-Slater local energy evaluation algorithm for fp-AFQMC, making the use of long expansions from selected configuration interaction methods feasible and demonstrating how these wave functions serve to mitigate the sign problem and accelerate convergence in quantum chemical problems.
Overcoming the Memory Bottleneck in Auxiliary Field Quantum Monte Carlo Simulations with Interpolative Separable Density Fitting.
We investigate the use of interpolative separable density fitting (ISDF) as a means to reduce the memory bottleneck in auxiliary field quantum Monte Carlo (AFQMC) simulations of real materials in
A heterogeneous CPU + GPU algorithm for variational two-electron reduced-density matrix driven complete active space self-consistent field theory.
This GPU-accelerated v2RDM-CASSCF algorithm is used to explore the electronic structure of the 3,k-circumacene and 3,k-periacene series and compare indicators of polyradical character in the lowest-energy singlet states to those observed for oligoacene molecules.
Taming the sign problem in auxiliary field quantum Monte Carlo using accurate trial wave functions
We explore different ways of incorporating accurate trial wave functions into free projection auxiliary field quantum Monte Carlo (fp-AFQMC). Trial states employed include coupled cluster singles and
Twenty Years of Auxiliary-Field Quantum Monte Carlo in Quantum Chemistry: An Overview and Assessment on Main Group Chemistry and Bond-Breaking
In this work, we present an overview of the phaseless auxiliary-field quantum Monte Carlo (ph-AFQMC) approach from a computational quantum chemistry perspective, and present a numerical assessment of
Binding and excitations in SixHy molecular systems using quantum Monte Carlo.
The results further corroborate that Si systems, and presumably also related main group IV and V elements of the periodic table (Ge, Sn, etc), exhibit some of the lowest fixed-node biases found in valence-only electronic structure QMC calculations.
Numerical assessment for accuracy and GPU acceleration of TD-DMRG time evolution schemes.
This paper compares the accuracy of three TD-DMRG time evolution schemes: global propagation and compression with the Runge-Kutta algorithm, the time-dependent variational principle with the matrix unfolding algorithm, and the time-dependent variational principle with the projector-splitting algorithm. The benchmarks are performed on the exciton dynamics of the Fenna-Matthews-Olson complex.


Monte Carlo MP2 on Many Graphical Processing Units.
It is numerically determined that the cost to achieve a given relative statistical uncertainty in an MC-MP2 energy increases as O(n^3) or better with system size n, which may be compared with the O(n^5) scaling of the conventional implementation of deterministic MP2.
Dynamic Precision for Electron Repulsion Integral Evaluation on Graphical Processing Units (GPUs).
It is shown that precision error can be effectively controlled by evaluating only the largest integrals in double precision, and a dynamic precision scheme is shown to be effective for an array of molecules ranging in size from 20 to nearly 2000 atoms.
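The idea of spending double precision only on the largest contributions can be illustrated with a toy accumulation. The sketch below is not the paper's integral code: the function `dynamic_precision_sum`, the use of a per-term magnitude bound, the threshold value, and the synthetic data are all assumptions chosen to show the principle.

```python
import numpy as np

def dynamic_precision_sum(values, bounds, threshold):
    """Hypothetical helper: sum terms, using float64 only where bound >= threshold.

    values:    per-term contributions (stand-ins for integral contributions)
    bounds:    cheap upper bounds on |values| used to pick the precision
    threshold: terms with a bound at or above this are handled in double
    """
    values = np.asarray(values, dtype=np.float64)
    big = np.asarray(bounds) >= threshold
    # Large terms: accumulate in double precision.
    hi = values[big].sum(dtype=np.float64)
    # Small terms: round to single precision and accumulate in single.
    lo = values[~big].astype(np.float32).sum(dtype=np.float32)
    return hi + float(lo), int(big.sum())

# Synthetic terms spanning eight orders of magnitude (assumed test data).
rng = np.random.default_rng(1)
n_terms = 10_000
values = rng.standard_normal(n_terms) * 10.0 ** rng.uniform(-8, 0, n_terms)
bounds = np.abs(values)          # in practice a bound would be cheaper than the term

mixed, n_double = dynamic_precision_sum(values, bounds, threshold=1e-4)
reference = values.sum(dtype=np.float64)
error = abs(mixed - reference)
```

Because the terms handled in single precision are individually tiny, their rounding errors stay far below the error that would result from evaluating everything in single precision, while only `n_double` of the `n_terms` terms pay the double-precision cost.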
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo.
This work proposes a novel multiple-rank delayed update scheme that enables probability evaluation while delaying the application of accepted moves to the matrices until a predetermined number of moves has accumulated, and that improves computational efficiency by using matrix-matrix operations instead of matrix-vector operations.
Quantum Chemistry on Graphical Processing Units. 3. Analytical Energy Gradients, Geometry Optimization, and First Principles Molecular Dynamics.
We demonstrate that a video gaming machine containing two consumer graphical cards can outpace a state-of-the-art quad-core processor workstation by a factor of more than 180× in Hartree-Fock energy
Auxiliary-field quantum Monte Carlo calculations of molecular systems with a Gaussian basis.
This work extends the recently introduced phaseless auxiliary-field quantum Monte Carlo approach to any single-particle basis and applies it to molecular systems with Gaussian basis sets, where it exhibits better overall accuracy and more uniform behavior than CCSD(T).
Quantum Chemistry on Graphical Processing Units. 2. Direct Self-Consistent-Field Implementation.
The use of graphical processing units (GPUs) is demonstrated to carry out complete self-consistent-field calculations for molecules with as many as 453 atoms (2131 basis functions) using coarse- and fine-grained parallelism.
Hybrid CPU/GPU Integral Engine for Strong-Scaling Ab Initio Methods.
A parallel integral algorithm for two-electron contributions occurring in Hartree-Fock and hybrid density functional theory that allows for a strong scaling parallelization on inhomogeneous compute clusters and a general strategy to use large basis sets like quadruple-ζ split valence on GPUs.
Chemical Transformations Approaching Chemical Accuracy via Correlated Sampling in Auxiliary-Field Quantum Monte Carlo.
A correlated sampling methodology for AFQMC which relies on error cancellation to dramatically accelerate the calculation of energy differences of relevance to chemical transformations and which is capable of calculating redox properties, deprotonation free energies, and hydrogen abstraction energies in an efficient manner without sacrificing accuracy.
Quantum Chemistry on Graphical Processing Units. 1. Strategies for Two-Electron Integral Evaluation.
It is demonstrated that Graphical Processing Units (GPUs) can be used very efficiently to calculate two-electron repulsion integrals over Gaussian basis functions, the first step in most quantum chemistry calculations.