STAMP: Stanford Transactional Applications for Multi-Processing
TLDR
This paper introduces the Stanford Transactional Applications for Multi-Processing (STAMP), a comprehensive benchmark suite for evaluating transactional memory (TM) systems, and uses the suite to evaluate six different TM systems, identify their shortcomings, and motivate further research on their performance characteristics.
Transactional memory coherence and consistency
TLDR
To explore the costs and benefits of TCC, the paper studies the characteristics of an optimal transaction-based memory system and examines how different design parameters could affect the performance of real systems.
Evaluating MapReduce for Multi-core and Multiprocessor Systems
TLDR
It is established that, given a careful implementation, MapReduce is a promising model for scalable performance on shared-memory systems with simple parallel code.
Heracles: Improving resource efficiency at scale
TLDR
Heracles is a feedback-based controller that enables the safe colocation of best-effort tasks alongside a latency-critical service. It dynamically manages multiple hardware and software isolation mechanisms to ensure that the latency-sensitive job meets its latency targets while maximizing the resources given to best-effort tasks.
ZSim: fast and accurate microarchitectural simulation of thousand-core systems
TLDR
ZSim, a fast, scalable, and accurate simulator, is built using bound-weave, a two-phase parallelization technique that scales parallel simulation on multicore hosts efficiently with minimal loss of accuracy. Lightweight user-level virtualization is implemented to support complex workloads.
Paragon: QoS-aware scheduling for heterogeneous datacenters
TLDR
Paragon is an online, scalable datacenter (DC) scheduler that is heterogeneity- and interference-aware. It is derived from robust analytical methods and uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload by identifying similarities to previously scheduled applications.
The case for RAMClouds: scalable high-performance storage entirely in DRAM
TLDR
This paper argues for a new approach to datacenter storage called RAMCloud, where information is kept entirely in DRAM and large-scale systems are created by aggregating the main memories of thousands of commodity servers.
Towards energy proportionality for large-scale latency-critical workloads
TLDR
PEGASUS is a feedback-based controller that significantly improves the energy proportionality of warehouse-scale computer (WSC) systems, as demonstrated by a real implementation in a Google search cluster.
An effective hybrid transactional memory system with strong isolation guarantees
TLDR
For certain workloads, SigTM can match the performance of a full-featured hardware TM system, while for workloads with large read-sets it can be up to two times slower.
Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system
TLDR
This work optimizes Phoenix, a MapReduce runtime for shared-memory multi-cores and multiprocessors, on a quad-chip, 32-core, 256-thread UltraSPARC T2+ system with NUMA characteristics, and shows how a multi-layered approach leads to significant speedup improvements with 256 threads.