CompostBin: A DNA Composition-Based Algorithm for Binning Environmental Shotgun Reads
This paper reports the development of CompostBin, a DNA composition-based algorithm that analyzes metagenomic sequence reads and distributes them into taxon-specific bins, and that can accurately bin raw sequence reads without assembly or training.
Accelerating Numerical Dense Linear Algebra Calculations with GPUs
This chapter presents the current best design and implementation practices for the acceleration of dense linear algebra (DLA) on GPUs, from the matrix–matrix multiplication kernel written in CUDA to the higher level algorithms for solving linear systems, eigenvalue and SVD problems.
- C. Newburn, G. Bansal, Jesús Labarta
- Computer Science
- IEEE International Parallel and Distributed…
- 1 May 2016
It is shown how a simple FIFO streaming model can be applied to heterogeneous systems that include manycore coprocessors and multicore CPUs, and how it enables tuning experts and runtime systems to tailor execution for different heterogeneous targets.
Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster
- I. Yamazaki, S. Rajamanickam, E. Boman, M. Hoemmen, M. Heroux, S. Tomov
- Computer Science
- SC14: International Conference for High…
- 16 November 2014
This paper presents the implementation of a communication-avoiding (CA) variant of the Generalized Minimum Residual (GMRES) method, called CA-GMRES, for solving nonsymmetric linear systems of equations on a hybrid CPU/GPU cluster, and outlines a domain decomposition framework that introduces a family of preconditioners suitable for CA Krylov methods.
Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs
This paper analyzes the numerical properties of the mixed-precision CholQR, which requires only one global reduction between the parallel processing units and performs most of its computation using BLAS-3 kernels.
Improving the Performance of CA-GMRES on Multicores with Multiple GPUs
- I. Yamazaki, H. Anzt, S. Tomov, M. Hoemmen, J. Dongarra
- Computer Science
- IEEE 28th International Parallel and Distributed…
- 19 May 2014
This study investigates several optimization techniques for the GPU kernels that can also be used in iterative solvers other than GMRES, and provides insight into how these techniques affect the performance of sparse solvers, so the results may have impact beyond GMRES.
Segmenting Point Sets
- I. Yamazaki, V. Natarajan, Z. Bai, B. Hamann
- Computer Science
- IEEE International Conference on Shape Modeling…
- 14 June 2006
This work introduces a technique for segmenting a point-sampled surface into distinct features without explicitly constructing a mesh or other surface representation, and applies the segmentation algorithm to laser-scanned models to evaluate its ability to capture geometric features in complex data sets.
Performance of asynchronous optimized Schwarz with one-sided communication
Optimizing Krylov Subspace Solvers on Graphics Processing Units
- H. Anzt, W. Sawyer, S. Tomov, P. Luszczek, I. Yamazaki, J. Dongarra
- Computer Science
- IEEE International Parallel & Distributed…
- 19 May 2014
This paper targets the acceleration of the BiCGSTAB solver for GPUs, showing that significant improvement can be achieved by reformulating the method and developing application-specific kernels instead of using the generic CUBLAS library provided by NVIDIA.
Numerical Methods for Quantum Monte Carlo Simulations of the Hubbard Model
One of the core problems in materials science is how the interactions between electrons in a solid give rise to properties like… (This work was partially supported by the National Science Foundation…)