Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations

Martin Kronbichler, Dmytro Sashko, Peter Munch
This work investigates a variant of the conjugate gradient (CG) method and embeds it into the context of high-order finite-element schemes with fast matrix-free operator evaluation and cheap preconditioners like the matrix diagonal. Relying on a data-dependency analysis and appropriate enumeration of degrees of freedom, we interleave the vector updates and inner products in a CG iteration with the matrix-vector product with only minor organizational overhead. As a result, around 90% of the… 
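To make the idea of combining vector updates and inner products concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it is a textbook diagonally (Jacobi) preconditioned CG in which the updates of the solution, residual, and preconditioned residual and the two inner products share a single loop over the vector entries. The interleaving with the matrix-vector product described in the abstract is not reproduced here. All names (`apply_laplacian_1d`, `jacobi_pcg`) and the 1D model operator are assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def apply_laplacian_1d(u: Vector) -> Vector:
    """Matrix-free action of the stencil [-1, 2, -1] with zero Dirichlet ends.

    Hypothetical model operator; no matrix is ever stored.
    """
    n = len(u)
    out = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out[i] = 2.0 * u[i] - left - right
    return out


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def jacobi_pcg(
    apply_A: Callable[[Vector], Vector],
    diag: Vector,
    b: Vector,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Tuple[Vector, int]:
    """Diagonally preconditioned CG with fused vector updates (illustrative)."""
    n = len(b)
    x = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0  # zero right-hand side: the zero vector is the solution

    r = list(b)                                # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]   # preconditioned residual
    p = list(z)                                # initial search direction
    rz = dot(r, z)

    for it in range(1, max_iter + 1):
        Ap = apply_A(p)                        # matrix-free operator evaluation
        alpha = rz / dot(p, Ap)

        # Fused pass: update x, r, z and accumulate both inner products
        # while each entry is touched once, instead of in separate sweeps.
        rz_new = 0.0
        rr = 0.0
        for i in range(n):
            x[i] += alpha * p[i]
            r[i] -= alpha * Ap[i]
            z[i] = r[i] / diag[i]
            rz_new += r[i] * z[i]
            rr += r[i] * r[i]

        if rr ** 0.5 <= tol * b_norm:
            return x, it

        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new

    return x, max_iter
```

A call such as `jacobi_pcg(apply_laplacian_1d, [2.0] * n, b)` solves the model problem with its constant diagonal as preconditioner. The fused loop reads and writes each vector entry once per iteration, which is the kind of memory-traffic saving the abstract refers to, here in a much simpler form.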
2 Citations
The deal.II Library, Version 9.4
An overview of the new features of the finite element library deal.II, version 9.4, is provided.


Efficient High-Order Discontinuous Galerkin Finite Elements with Matrix-Free Implementations
This work proposes an element-based shared-memory parallelization option and compares it to a well-established shared-memory parallelization with global face data. It shows that merging the more arithmetically heavy operator evaluation with vector operations in application code more than doubles efficiency on the latest hardware.
A study of vectorization for matrix-free finite element methods
This work studies cross-element vectorization in the finite element framework Firedrake via code transformation and demonstrates the efficacy of such an approach by evaluating a wide range of matrix-free operators spanning different polynomial degrees and discretizations on two recent CPUs using three mainstream compilers.
A generic interface for parallel cell-based finite element operator application
Efficient Nonlinear Solvers for Nodal High-Order Finite Elements in 3D
This work presents a method in which the action of the Jacobian is applied matrix-free exploiting a tensor product basis on hexahedral elements, while much sparser matrices based on Q1 sub-elements on the nodes of the high-order basis are assembled for preconditioning.
Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm
Fast Matrix-Free Evaluation of Discontinuous Galerkin Finite Element Operators
We present an algorithmic framework for matrix-free evaluation of discontinuous Galerkin finite element operators. It relies on fast quadrature with sum factorization on quadrilateral and hexahedral
Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors
A GPU parallelization of a matrix-free geometric multigrid iterative solver targeting moderate and high polynomial degrees, with support for general curved and adaptively refined hexahedral meshes with hanging nodes, is developed.
Reformulated Conjugate Gradient for the Energy-Aware Solution of Linear Systems on GPUs
A redesign of the conjugate gradient method for the iterative solution of sparse linear systems on heterogeneous systems accelerated by graphics processing units (GPUs) results in a slight increase of performance and enables the efficient exploitation of power-saving techniques implicit in the hardware.
Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes
A GPU parallelization of the matrix-free method, including a novel algorithm for resolving hanging-node constraints on the GPU, that supports simulation on adaptively refined grids and can solve problems 8 times larger in 3D.
hyper.deal: An Efficient, Matrix-free Finite-element Library for High-dimensional Partial Differential Equations
This work presents the efficient, matrix-free finite-element library hyper.deal for solving partial differential equations in two to six dimensions with high-order discontinuous Galerkin methods and reports results for high-dimensional advection problems and for the solution of the Vlasov–Poisson equation in up to 6D phase space.