# Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations

@article{Kronbichler2022EnhancingDL, title={Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations}, author={Martin Kronbichler and Dmytro Sashko and Peter Munch}, journal={ArXiv}, year={2022}, volume={abs/2205.08909} }

This work investigates a variant of the conjugate gradient (CG) method and embeds it into the context of high-order finite-element schemes with fast matrix-free operator evaluation and cheap preconditioners like the matrix diagonal. Relying on a data-dependency analysis and appropriate enumeration of degrees of freedom, we interleave the vector updates and inner products in a CG iteration with the matrix-vector product with only minor organizational overhead. As a result, around 90% of the…
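For orientation, here is a minimal sketch of the textbook conjugate gradient iteration with a diagonal (Jacobi) preconditioner and a matrix-free operator. It is not the variant studied in the paper: the matrix-vector product, vector updates, and inner products are kept as separate passes, which is the baseline that the paper's interleaving aims to improve on. The 1D Laplacian stencil and the names `apply_laplacian_1d`, `dot`, and `jacobi_pcg` are illustrative assumptions, not taken from the paper.

```python
def apply_laplacian_1d(x):
    """Matrix-free action of tridiag(-1, 2, -1) on x (hypothetical test operator)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y[i] = 2.0 * x[i] - left - right
    return y


def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def jacobi_pcg(apply_A, diag, b, tol=1e-10, max_iter=1000):
    """Textbook preconditioned CG with M = diag(A); returns (x, iterations)."""
    n = len(b)
    x = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0  # zero right-hand side: x = 0 is already the solution
    r = list(b)                                 # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]    # apply the diagonal preconditioner
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = apply_A(p)                         # separate pass: operator evaluation
        alpha = rz / dot(p, Ap)                 # separate pass: inner product
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # vector update
        r = [ri - alpha * api for ri, api in zip(r, Ap)]   # vector update
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    n = 20
    rhs = apply_laplacian_1d([1.0] * n)         # right-hand side with known solution
    solution, iterations = jacobi_pcg(apply_laplacian_1d, [2.0] * n, rhs)
    print(iterations, max(abs(s - 1.0) for s in solution))
```

Each iteration in this sketch sweeps over the vectors several times (once in `apply_A`, then again for every update and inner product), so for a cheap matrix-free operator the vector passes contribute a large share of the memory traffic.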

## 3 Citations

### Stage-parallel fully implicit Runge-Kutta implementations with optimal multilevel preconditioners at the scaling limit

- Computer Science
- 2022

We present an implementation of a fully stage-parallel preconditioner for Radau IIA type fully implicit Runge–Kutta methods, which approximates the inverse of A Q from the Butcher tableau by the…

### On the implementation of a robust and efficient finite element-based parallel solver for the compressible Navier-Stokes equations

- Computer Science
- Computer Methods in Applied Mechanics and Engineering
- 2021

### The deal.II library, Version 9.4

- Computer Science, Linguistics
- Journal of Numerical Mathematics
- 2022

An overview of the new features of the finite element library deal.II, version 9.4, is provided.

## References

Showing 1-10 of 52 references.

### Efficient High-Order Discontinuous Galerkin Finite Elements with Matrix-Free Implementations

- Computer Science
- Advances and New Trends in Environmental Informatics
- 2018

This work proposes an element-based shared-memory parallelization option, compares it to a well-established shared-memory parallelization with global face data, and shows that merging the more arithmetically heavy operator evaluation with vector operations in application code can more than double efficiency on the latest hardware.

### A generic interface for parallel cell-based finite element operator application

- Computer Science
- 2012

### Efficient Nonlinear Solvers for Nodal High-Order Finite Elements in 3D

- Computer Science
- J. Sci. Comput.
- 2010

This work presents a method in which the action of the Jacobian is applied matrix-free exploiting a tensor product basis on hexahedral elements, while much sparser matrices based on Q1 sub-elements on the nodes of the high-order basis are assembled for preconditioning.

### Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm

- Computer Science
- Parallel Comput.
- 2014

### Fast Matrix-Free Evaluation of Discontinuous Galerkin Finite Element Operators

- Computer Science
- ACM Trans. Math. Softw.
- 2019

We present an algorithmic framework for matrix-free evaluation of discontinuous Galerkin finite element operators. It relies on fast quadrature with sum factorization on quadrilateral and hexahedral…

### Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors

- Computer Science
- ACM Trans. Parallel Comput.
- 2019

A GPU parallelization of a matrix-free geometric multigrid iterative solver is developed, targeting moderate and high polynomial degrees and supporting general curved and adaptively refined hexahedral meshes with hanging nodes.

### Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes

- Computer Science
- SpringSim
- 2017

This work presents a GPU parallelization of the matrix-free method, including a novel algorithm for resolving hanging-node constraints on the GPU. The implementation is capable of simulation on adaptively refined grids and can solve problems 8 times larger in 3D.

### hyper.deal: An Efficient, Matrix-free Finite-element Library for High-dimensional Partial Differential Equations

- Computer Science
- ACM Trans. Math. Softw.
- 2021

This work presents the efficient, matrix-free finite-element library hyper.deal for solving partial differential equations in two to six dimensions with high-order discontinuous Galerkin methods and reports results for high-dimensional advection problems and for the solution of the Vlasov–Poisson equation in up to 6D phase space.

### A stencil scaling approach for accelerating matrix-free finite element implementations

- Computer Science
- SIAM J. Sci. Comput.
- 2018

We present a novel approach to fast on-the-fly low order finite element assembly for scalar elliptic partial differential equations of Darcy type with variable coefficients optimized for matrix-free…

### The Communication-Hiding Conjugate Gradient Method with Deep Pipelines

- Computer Science
- ArXiv
- 2018

This work extends the pipelined CG method to deeper pipelines, which allows further scaling when the global communication phase is the dominant time-consuming factor, and the resulting method is able to hide communication latency behind computational work.