The tao of parallelism in algorithms
TLDR
It is suggested that the operator formulation and tao-analysis of algorithms can be the foundation of a systematic approach to parallel programming.
Lonestar: A suite of parallel irregular programs
TLDR
The first five programs from the Lonestar benchmark suite are characterized, which target domains like data mining, survey propagation, and design automation, and it is shown that even such irregular applications often expose large amounts of parallelism in the form of amorphous data-parallelism.
Accelerating multicore reuse distance analysis with sampling and parallelization
TLDR
A sampled, parallelized method of measuring reuse distance profiles for multithreaded programs, modeling private and shared cache configurations, and using O(1) data structures that may be made thread-private to reduce overhead in analysis mode.
Optimistic parallelism requires abstractions
TLDR
The design and implementation of programming abstractions that permit programmers to highlight opportunities for exploiting parallelism in sequential programs are described, along with a runtime system that uses these hints to execute the program in parallel.
OCTET: capturing and controlling cross-thread dependences efficiently
TLDR
A new software-based concurrency control mechanism called OCTET is introduced that soundly captures cross-thread dependences and can be used to build dynamic analyses for concurrency correctness. Results suggest that OCTET can provide a foundation for developing low-overhead analyses that check and enforce concurrency correctness.
Optimistic parallelism requires abstractions
TLDR
It is shown that Delaunay mesh generation and agglomerative clustering can be parallelized in a straightforward way using the Galois approach, and results suggest that Galois is a practical approach to exploiting data parallelism in irregular programs.
Techniques for Fine-Grained, Multi-site Computation Offloading
TLDR
This paper describes algorithmic approaches for performing fine-grained, multi-site offloading, based on a novel partitioning algorithm, and a novel data representation that allows portions of an application to be offloaded in a data-centric manner, even if that data exists at multiple sites.
How much parallelism is there in irregular applications?
TLDR
The design and implementation of a tool called ParaMeter is described that produces parallelism profiles for irregular programs, and it is explained how these profiles provide insight into the behavior of these applications.
Hybrid Static–Dynamic Analysis for Statically Bounded Region Serializability
TLDR
This paper shows how to provide stronger semantics for racy programs while delivering relatively good performance on commodity systems, using a novel hybrid static-dynamic analysis called EnfoRSer, which provides end-to-end support for a memory model that is not only stronger than weak memory models but is strictly stronger than SC.
Exploiting the commutativity lattice
TLDR
It is shown how commutativity specifications from this lattice can be systematically implemented in one of three different schemes (abstract locking, forward gatekeeping, and general gatekeeping), and that these schemes are practical and can deliver speedup on three real-world applications.