Publications
Architecting phase change memory as a scalable DRAM alternative
TLDR
This work proposes area-neutral architectural enhancements, crafted from a fundamental understanding of PCM technology parameters, that address PCM's limitations and make it competitive with DRAM.
Flipping bits in memory without accessing them: An experimental study of DRAM disturbance errors
TLDR
This paper exposes the vulnerability of commodity DRAM chips to disturbance errors, and shows that repeatedly reading from the same address, which repeatedly activates the same DRAM row, can corrupt data in nearby addresses.
Base-delta-immediate compression: Practical data compression for on-chip caches
TLDR
There is a need for a simple yet efficient compression technique that can effectively compress common in-cache data patterns and has minimal effect on cache access latency.
RAIDR: Retention-aware intelligent DRAM refresh
TLDR
This paper proposes RAIDR (Retention-Aware Intelligent DRAM Refresh), a low-cost mechanism that can identify and skip unnecessary refreshes using knowledge of cell retention times. RAIDR groups DRAM rows into retention-time bins and applies a different refresh rate to each bin.
A scalable processing-in-memory accelerator for parallel graph processing
TLDR
This work argues that the conventional concept of processing-in-memory (PIM) can be a viable solution for achieving memory-capacity-proportional performance, and designs a programmable PIM accelerator for large-scale graph processing called Tesseract.
A case for bufferless routing in on-chip networks
TLDR
A case is made for a new approach to designing on-chip interconnection networks that eliminates the need for buffers for routing or flow control, and new algorithms for routing without using buffers in router input/output ports are described.
Parallelism-Aware Batch Scheduling: Enhancing both Performance and Fairness of Shared DRAM Systems
TLDR
This paper presents a parallelism-aware batch scheduler that seamlessly incorporates support for system-level thread priorities and can provide different service levels, including purely opportunistic service, to threads with different priorities. It is also simpler to implement than STFM (stall-time fair memory scheduling).
Ramulator: A Fast and Extensible DRAM Simulator
TLDR
This paper presents Ramulator, a fast and cycle-accurate DRAM simulator that is built from the ground up for extensibility and provides out-of-the-box support for a wide array of DRAM standards.
Personalized Copy-Number and Segmental Duplication Maps using Next-Generation Sequencing
TLDR
An algorithm (mrFAST) is presented to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes, and can distinguish between different copies of highly identical genes.
Thread Cluster Memory Scheduling: Exploiting Differences in Memory Access Behavior
TLDR
This paper presents Thread Cluster Memory scheduling (TCM), a new memory scheduling algorithm that addresses system throughput and fairness separately with the goal of achieving the best of both. It evaluates TCM on a wide variety of multiprogrammed workloads and compares its performance to four previously proposed scheduling algorithms, finding that TCM achieves both the best system throughput and the best fairness.