Wayne Joubert

The solution of nonsymmetric systems of linear equations continues to be a difficult problem. A main algorithm for solving nonsymmetric problems is restarted GMRES. The algorithm is based on restarting full GMRES every s iterations, for some integer s>0. This paper considers the impact of the restart frequency s on the convergence and work requirements of the …
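The restart scheme described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration only, not the paper's implementation: the function names `gmres_cycle` and `restarted_gmres`, the tolerance, the breakdown threshold, and the small tridiagonal test matrix are all assumptions made for the example.

```python
import numpy as np

def gmres_cycle(A, b, x0, s):
    """Run at most s GMRES steps from the starting guess x0 (one restart cycle)."""
    r0 = b - A @ x0
    beta = np.linalg.norm(r0)
    if beta == 0.0:
        return x0
    n = b.shape[0]
    V = np.zeros((n, s + 1))   # orthonormal Krylov basis
    H = np.zeros((s + 1, s))   # upper Hessenberg matrix from Arnoldi
    V[:, 0] = r0 / beta
    k = s
    for j in range(s):
        w = A @ V[:, j]
        # Modified Gram-Schmidt against the existing basis vectors.
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-14:   # assumed breakdown threshold
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # Minimize ||beta * e1 - H y|| over the Krylov subspace built so far.
    e1 = np.zeros(k + 1)
    e1[0] = beta
    y, *_ = np.linalg.lstsq(H[:k + 1, :k], e1, rcond=None)
    return x0 + V[:, :k] @ y

def restarted_gmres(A, b, s, tol=1e-8, max_cycles=200):
    """Restart GMRES every s iterations until the relative residual is below tol."""
    x = np.zeros_like(b, dtype=float)
    bnorm = np.linalg.norm(b)
    for cycle in range(max_cycles):
        if np.linalg.norm(b - A @ x) <= tol * bnorm:
            return x, cycle
        x = gmres_cycle(A, b, x, s)
    return x, max_cycles

# Hypothetical nonsymmetric test problem: a tridiagonal matrix with unequal off-diagonals.
n = 20
A = 4.0 * np.eye(n) - np.eye(n, k=1) - 2.0 * np.eye(n, k=-1)
b = np.ones(n)
x, cycles = restarted_gmres(A, b, s=5)
```

In this sketch each cycle stores s+1 basis vectors and its orthogonalization work grows roughly with s squared, while a smaller s discards the Krylov basis more often; that is the trade-off between convergence and work per cycle that the restart frequency s controls.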
In this study a hybrid generalized minimal residual (GMRES) polynomial preconditioning algorithm for solving nonsymmetric systems of linear equations is defined. The algorithm uses the results from cycles of restarted GMRES to form an effective polynomial preconditioner, typically resulting in decreased work requirements. The algorithm has the advantage over …
New parallel computers are emerging, but developing efficient scientific code for them remains difficult. A scientist must manage not only the science-domain complexity but also the performance-optimization complexity. HERCULES is a code transformation system designed to help the scientist to separate the two concerns, which improves code maintenance, and …
The use of computational accelerators such as NVIDIA GPUs and Intel Xeon Phi processors is now widespread in the high performance computing community, with many applications delivering impressive performance gains. However, programming these systems for high performance, performance portability and software maintainability has been a challenge. In this …
We present simulations of blood and cancer cell separation in complex microfluidic channels with subcellular resolution, demonstrating unprecedented time to solution, performing at 65.5% of the available 39.4 PetaInstructions/s in the 18,688 nodes of the Titan supercomputer. These simulations outperform by one to three orders of magnitude the current …
In this paper two new implementations of SSOR and incomplete factorization preconditioners are given, for shared memory and distributed memory parallel computers respectively. These new implementations give increased solution speeds for matrix problems such as those arising from discretized partial differential equations with natural ordering of the grid …