Time-tiling is necessary for the efficient execution of iterative stencil computations. Classical hyper-rectangular tiles cannot be used due to the combination of backward and forward dependences along space dimensions. Existing techniques trade temporal data reuse for inefficiencies in other areas, such as load imbalance, redundant computations, or… (More)

Stencil computations arise in many scientific computing domains, and often represent time-critical portions of applications. There is significant interest in offloading these computations to high-performance devices such as GPU accelerators, but these architectures offer challenges for developers and compilers alike. Stencil computations in particular… (More)

- Jeswin Godwin, Justin Holewinski, P. Sadayappan
- GPGPU@ASPLOS
- 2012

In this paper, we address efficient sparse matrix-vector multiplication for matrices arising from structured grid problems with high degrees of freedom at each grid node. Sparse matrix-vector multiplication is a critical step in the iterative solution of sparse linear systems of equations arising in the solution of partial differential equations using… (More)

- Justin Holewinski, Ragavendar Ramamurthi, +4 authors P. Sadayappan
- PLDI
- 2012

Recent hardware trends with GPUs and the increasing vector lengths of SSE-like ISA extensions for multicore CPUs imply that effective exploitation of SIMD parallelism is critical for achieving high performance on emerging and future architectures. A vast majority of existing applications were developed without any attention by their developers towards… (More)

- Daniel Lowell, Jeswin Godwin, +7 authors Jason Sarich
- SIAM J. Scientific Computing
- 2013

Numerical solutions of nonlinear partial differential equations frequently rely on iterative Newton-Krylov methods, which linearize a finite-difference stencil-based discretization of a problem, producing a sparse matrix with regular structure. Knowledge of this structure can be used to exploit parallelism and locality of reference on modern cache-based… (More)

- Mahesh Ravishankar, Justin Holewinski, Vinod Grover
- GPGPU@PPoPP
- 2015

As architectures evolve, optimization techniques to obtain good performance evolve as well. Using low-level programming languages like C/C++ typically results in architecture-specific optimization techniques getting entangled with the application specification. In such situations, moving from one target architecture to another usually requires a… (More)

- Prashant Singh Rawat, Martin Kong, +6 authors P. Sadayappan
- WOLFHPC@SC
- 2015

Stencil computations are at the core of applications in a number of scientific computing domains. We describe a domain-specific language for regular stencil computations that allows specification of the computations in a concise manner. We describe a multi-target compiler for this DSL, which generates optimized code for GPUs, FPGAs, and multi-core… (More)

- Tom Henretty, Justin Holewinski, +5 authors P. Sadayappan
- 2013

Stencil computations are an integral part of applications in a number of scientific computing domains, such as image processing and partial differential equations. We describe a domain-specific language for regular stencil computations, that allows specification of the computations in a concise manner. We describe a multi-target compiler for this DSL, that… (More)
