Corpus ID: 15197353

An Integrated Simulation Infrastructure for the Entire Memory Hierarchy: Cache, DRAM, Nonvolatile Memory, and Disk

@inproceedings{Lu2013ANIS,
  title={An Integrated Simulation Infrastructure for the Entire Memory Hierarchy: Cache, DRAM, Nonvolatile Memory, and Disk},
  author={Shih-Lien Lu and Bruce Jacob},
  year={2013}
}
As computer systems evolve towards exascale and attempt to meet new application requirements such as big data, conventional memory technologies and architectures are no longer adequate in terms of bandwidth, power, capacity, or resilience. In order to understand these problems and analyze potential solutions, an accurate simulation environment that captures all of the complex interactions of the modern computer system is essential. In this article, we present an integrated simulation… 

FAME: A Fast and Accurate Memory Emulator for New Memory System Architecture Exploration

TLDR
This work uses a QPI-attached FPGA, a device mainly used to accelerate computation, to bring fast, cache-coherent DRAM emulation to systems that previously relied only on the limited capabilities of simulation and NUMA platforms. It also describes an integrated methodology that helps navigate DRAM memory caching and the microarchitectural timing simulation of new technologies.

NVMain Extension for Multi-Level Cache Systems

TLDR
This paper extends the cache model of the NVMain memory simulator by introducing an SRAM cache model and its supporting modules. It also provides a reference implementation of an optimized cache organization scheme for die-stacked DRAM cache, along with a tag-cache unit; together, these reduce cache miss latency.

Simulating DRAM controllers for future system architecture exploration

TLDR
This work presents a high-level memory controller model, specifically designed for full-system exploration of future system architectures, that captures the most important DRAM timing constraints for current and emerging DRAM interfaces, e.g. DDR3, LPDDR3 and WideIO.

Siena: Exploring the Design Space of Heterogeneous Memory Systems

  • I. Peng, J. Vetter
  • Computer Science
    SC18: International Conference for High Performance Computing, Networking, Storage and Analysis
  • 2018
TLDR
This paper systematically explores the organization of heterogeneous memory systems on a framework called Siena, which facilitates quick exploration of memory architectures with flexible configurations of memory systems and realistic memory workloads.

Performance Impact of Emerging Memory Technologies on Big Data Applications: A Latency-Programmable System Emulation Approach

TLDR
A memory-latency programmable emulator, based on an FPGA-attached server system, is presented; the results indicate that the benefit of high-capacity memory could outweigh the performance loss due to longer memory latency.

The Case for Associative DRAM Caches

TLDR
This work makes the case that despite these problems, associativity is still a desirable feature for DRAM caches by demonstrating the benefits of associativity for a wide range of cache configurations and workloads.

Dual-Page Checkpointing

TLDR
A new dual-page checkpointing system is designed, which achieves low metadata cost and eliminates most excessive NVM writes at the same time, and breaks the traditional trade-off between metadata space cost and extra data writes.

Machine learning based design space exploration for hybrid main-memory design

TLDR
A machine learning (ML) based design space exploration (DSE) method that builds predictive models for various responses of a hybrid main-memory system and demonstrates the results in terms of the learning curve characteristics for hyperparameter tuning and the statistical error analyses of the designed predictive models.

VAIL: A Victim-Aware Cache Policy for Improving Lifetime of Hybrid Memory

TLDR
The proposed victim-aware cache policy (VAIL), designed for a DRAM and NVM hybrid memory system, takes the eviction locality of victims from the DRAM cache into consideration to reduce writebacks to NVM and improve the DRAM hit ratio at the same time.

References

Scalable high performance main memory system using phase-change memory technology

TLDR
This paper analyzes a PCM-based hybrid main memory system using an architecture-level model of PCM and proposes simple organizational and management solutions for the hybrid memory that reduce the write traffic to PCM, boosting its lifetime from 3 years to 9.7 years.

CMP Memory Modeling: How Much Does Accuracy Matter?

TLDR
The necessity of a cycle-accurate memory model for CMP architectures is demonstrated, and it is shown that the performance difference among the memory models compared grows as the number of on-chip cores increases.

Buffer-on-board memory systems

TLDR
A hardware-verified simulation suite is developed and full system simulations are performed to better understand how this memory system interacts with an operating system executing an application with the goal of uncovering behaviors not present in simple limit case simulations.

Relaxing non-volatility for fast and energy-efficient STT-RAM caches

TLDR
It is found that a pure STT-RAM cache hierarchy provides the best energy efficiency, though a hybrid design of SRAM-based L1 caches with reduced-retention STT-RAM L2 and L3 caches eliminates performance loss while still reducing the energy-delay product by more than 70%.

Cache decay: exploiting generational behavior to reduce cache leakage power

TLDR
This paper examines methods for reducing leakage power within the cache memories of the CPU by invalidating and "turning off" cache lines when they hold data not likely to be reused, and proposes adaptive decay-based policies that make energy-minimizing policy choices on a per-application basis.

The DiskSim Simulation Environment Version 4.0 Reference Manual (CMU-PDL-08-101)

TLDR
This manual describes how to configure and use DiskSim, which has been made publicly available with the hope of advancing the state-of-the-art in disk system performance evaluation in the research community.

The PARSEC benchmark suite: Characterization and architectural implications

TLDR
This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs), and shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic.

The bleak future of NAND flash memory

TLDR
It is shown that future gains in density will come at significant drops in performance and reliability, and SSD manufacturers and users will face a tough choice in trading off between cost, performance, capacity and reliability.

Leakage current mechanisms and leakage reduction techniques in deep-submicrometer CMOS circuits

TLDR
Channel engineering techniques including retrograde well and halo doping are explained as means to manage short-channel effects for continuous scaling of CMOS devices and different circuit techniques to reduce the leakage power consumption are explored.