- Jamie Liu, Ben Jaiyen, Richard Veras, Onur Mutlu
- 2012 39th Annual International Symposium on…
- 2012

Dynamic random-access memory (DRAM) is the building block of modern main memory systems. DRAM cells must be periodically refreshed to prevent loss of data. These refresh operations waste energy and degrade system performance by interfering with memory accesses. The negative effects of DRAM refresh increase as DRAM device capacity increases. Existing DRAM…

Stencil computations are an integral component of applications in a number of scientific computing domains. Short-vector SIMD instruction sets are ubiquitous on modern processors and can be used to significantly increase the performance of stencil computations. Traditional approaches to optimizing stencils on these platforms have focused on either…

Data locality and parallelism are critical optimization objectives for performance on modern multi-core machines. Both coarse-grain parallelism (e.g., multi-core) and fine-grain parallelism (e.g., vector SIMD) must be effectively exploited, but despite decades of progress at both ends, current compiler optimization schemes that attempt to address data…

- Richard Veras, Franz Franchetti
- VECPAR
- 2014

Matrix-matrix multiplication (MMM) is a fundamental operation in scientific computing. Achieving the floating-point peak with this operation requires expert knowledge of linear algebra and computer architecture to craft a tuned implementation for a given microarchitecture. The expert follows a mechanical process for implementing MMM by deriving the…

- Richard Veras, Doru-Thom Popovici, Tze Meng Low, Franz Franchetti
- WPMVP@PPoPP
- 2016

Achieving high performance for compute-bound numerical kernels typically requires an expert to hand-select an appropriate set of single-instruction, multiple-data (SIMD) instructions and then statically schedule them to hide their latency while avoiding register spilling. Unfortunately, this level of control over the code forces the…

- Richard Veras, Tze Meng Low, Franz Franchetti
- 2016 IEEE High Performance Extreme Computing…
- 2016

Many real-world graphs, such as those that arise from the web, biology and transportation, appear random and without a structure that can be exploited for performance on modern computer architectures. However, these graphs have a scale-free graph topology that can be leveraged for locality. Existing sparse data formats are not designed to take advantage of…

- Tom Henretty, Justin Holewinski, +5 authors P. Sadayappan
- 2013

Stencil computations are an integral part of applications in a number of scientific computing domains, such as image processing and partial differential equations. We describe a domain-specific language (DSL) for regular stencil computations that allows the computations to be specified concisely. We describe a multi-target compiler for this DSL that…
