- Full text PDF available (19)
- This year (0)
- Last 5 years (8)
- Last 10 years (22)
Journals and Conferences
An updated take on Amdahl's analytical model uses modern design constraints to analyze many-core design alternatives. The revised models provide computer architects with a better understanding of many-core design types, enabling them to make more informed tradeoffs.
As technology scaling poses a threat to DRAM scaling due to physical limitations such as limited charge, alternative memory technologies including several emerging non-volatile memories are being explored as possible DRAM replacements. One main roadblock for wider adoption of these new memories is the limited write endurance, which leads to wear-out related… (More)
Phase change memory (PCM) is an emerging memory technology for future computing systems. Compared to other non-volatile memory alternatives, PCM is more matured to production, and has a faster read latency and potentially higher storage density. The main roadblock precluding PCM from being used, in particular, in the main memory hierarchy, is its limited… (More)
Memory bandwidth has become a major performance bottleneck as more and more cores are integrated onto a single die, demanding more and more data from the system memory. Several prior studies have demonstrated that this memory bandwidth problem can be addressed by employing a 3D-stacked memory architecture, which provides a wide, high frequency memory-bus… (More)
Several recent works have demonstrated the benefits of through-silicon-via (TSV) based 3D integration [1-4], but none of them involves a fully functioning multicore processor and memory stacking. 3D-MAPS (3D Massively Parallel Processor with Stacked Memory) is a two-tier 3D IC, where the logic die consists of 64 general-purpose processor cores running at… (More)
We describe the design and analysis of 3D-MAPS, a 64-core 3D-stacked memory-on-processor running at 277 MHz with 63 GB/s memory bandwidth, sent for fabrication using Tezzaron's 3D stacking technology. We also describe the design flow used to implement it using industrial 2D tools and custom add-ons to handle 3D specifics.
Due to the ever-increasing design complexity and physical constraint in frequency scaling, chip multiprocessors are considered the de facto architecture baseline for future processor generation. Through resource sharing, applications running on a CMP can achieve better resource utilization and faster inter-core communication , leading to a higher overall… (More)
Single-chip CPU/GPU architecture is being adopted in high-end (embedded) systems, e.g., smartphones and tablet PCs. Main memory subsystem is expected to consist of hybrid DRAM and phase-change RAM (PRAM) due to the difficulties in DRAM scaling. In this work, we address the performance optimization of the hybrid DRAM/PRAM main memory for single chip CPU/GPU.… (More)
Virtual caches are employed as L1 caches of both high performance and embedded processors to meet their short latency requirements. However, they also introduce the synonym problem where the same physical cache line can be present at multiple locations in the cache due to their distinct virtual addresses, leading to potential data consistency issues. To… (More)
—This paper describes the architecture, design, analysis, and simulation and measurement results of the 3D-MAPS (3D massively parallel processor with stacked memory) chip built with a 1.5 V, 130 nm process technology and a two-tier 3D stacking technology using 1.2 μ-diameter, 6 μ-height through-silicon vias (TSVs) and μ-diameter face-to-face bond pads.… (More)