Techniques For The Trace-Driven Simulation Of Cache Performance

@article{Eggers1989TechniquesFT,
  title={Techniques For The Trace-Driven Simulation Of Cache Performance},
  author={Susan J. Eggers and Edward D. Lazowska and Y. B. Lin},
  journal={1989 Winter Simulation Conference Proceedings},
  year={1989},
  pages={1042-1046}
}
In contemporary computers, cache memories are interposed between processors and primary memories in order to decrease access time and bus traffic. Because the design of the cache is critical and the factors affecting its performance are complex, trace-driven simulation is widely used and studied. This paper surveys three interesting techniques for the trace-driven simulation of cache designs: stack analysis methodologies that make it possible to obtain performance measures for a wide variety of… 
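As an illustration of the stack analysis idea mentioned in the abstract, the following is a minimal sketch, not the paper's own implementation. It assumes a fully associative cache with LRU replacement and a trace that has already been reduced to block identifiers. The function names `lru_stack_distances` and `miss_ratio` are hypothetical. Because an LRU cache of size C always holds the top C entries of the LRU stack, one pass over the trace yields a histogram of stack distances from which the miss ratio of every cache size can be read off.

```python
from collections import Counter


def lru_stack_distances(trace):
    """One pass over the trace; returns a histogram of LRU stack distances.

    The key None counts first references (misses for every cache size).
    Assumes each trace element is a hashable block identifier.
    """
    stack = []          # index 0 is the most recently used block
    hist = Counter()
    for block in trace:
        if block in stack:
            depth = stack.index(block) + 1   # 1-based stack distance
            stack.remove(block)
            hist[depth] += 1
        else:
            hist[None] += 1                  # cold miss
        stack.insert(0, block)               # block becomes most recent
    return hist


def miss_ratio(hist, capacity):
    """Miss ratio of a fully associative LRU cache holding `capacity` blocks."""
    total = sum(hist.values())
    hits = sum(n for d, n in hist.items() if d is not None and d <= capacity)
    return (total - hits) / total


if __name__ == "__main__":
    hist = lru_stack_distances(list("abcabda"))
    for size in (1, 2, 3, 4):
        print(size, miss_ratio(hist, size))
```

The list-based stack keeps the sketch short at the cost of linear-time lookups per reference; a production simulator would likely use a faster structure and would also need to handle set-associative mappings, which this sketch does not attempt.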


Mable: A Technique for Efficient Machine Simulation
TLDR
A framework for an efficient instruction-level machine simulator which can be used with existing software tools to develop and analyze programs for a proposed processor architecture and which has applicability to a diverse set of simulation problems is presented.
Trace Generation for Multiprocessor System
TLDR
A synthetic address trace generation model is proposed which combines the accuracy advantage of trace-driven simulation and the low complexity advantage of discrete event simulation and provides flexibility in characterizing the system workload independent of cache structure.
Interaction Of Cache Coherency And Media Access Protocols In The Optically Interconnected Distributed Memory Environment
TLDR
A synthetic address trace generation model, which is independent of the system architecture and offers great flexibility in characterizing various workloads, is used as an input to the system.
Efficient Simulation of Parallel Computer Systems
TLDR
The resulting system is program-driven, but the overhead is significantly reduced by profiling the program to get timing estimates for its basic blocks, which are then used at run time to generate process execution times dynamically while avoiding a detailed emulation of each instruction's execution.

References

SHOWING 1-10 OF 24 REFERENCES
Aspects of cache memory and instruction buffer performance
TLDR
Techniques are developed in this dissertation to efficiently evaluate direct-mapped and set-associative caches and examine instruction caches for single-chip RISC microprocessors, and it is demonstrated that instruction buffers will be preferred to target instruction buffers in future RISC microprocessors implemented on single CMOS chips.
Aspects of Cache Memory and Instruction
TLDR
Techniques are developed in this dissertation to efficiently evaluate direct-mapped and set-associative caches for single-chip RISC microprocessors, and it is demonstrated that instruction buffers will be preferred to target instruction buffers in future RISC microprocessors implemented on single CMOS chips.
Efficient Analysis of Caching Systems
TLDR
These techniques are significant extensions to the stack analysis technique (Mattson et al., 1970) which computes the read miss ratio for all cache sizes in a single trace-driven simulation, and are used to study caching in a network file system.
A low-overhead coherence solution for multiprocessors with private cache memories
This paper presents a cache coherence solution for multiprocessors organized around a single time-shared bus. The solution aims at reducing bus traffic and hence bus wait time. This in turn increases
Simulation analysis of data-sharing in shared memory multiprocessors
TLDR
Examination of shared memory reference patterns in parallel programs that run on bus-based, shared memory multiprocessors reveals two distinct modes of sharing behavior: sequential sharing and fine-grain sharing.
Evaluating Associativity in CPU Caches
TLDR
Although all-associativity simulation is theoretically less efficient than forest simulation or stack simulation (a commonly used simulation algorithm), in practice it is not much slower and allows the simulation of many more caches with a single pass through an address trace.
Two Methods for the Efficient Analysis of Memory Address Trace Data
  • A. Smith
  • Computer Science
  • IEEE Transactions on Software Engineering
  • 1977
TLDR
There is little or no loss in accuracy using reduced traces for many purposes for a wide range of memory sizes and degrees of reduction.
Evaluation Techniques for Storage Hierarchies
TLDR
A new and efficient method of determining, in one pass of an address trace, performance measures for a large class of demand-paged, multilevel storage systems utilizing a variety of mapping schemes and replacement algorithms.
Analysis of cache replacement-algorithms
TLDR
The model shows that the majority of the cache misses that OPT avoids over LRU come from the most-recently-discarded lines of the LRU cache, which leads to three realizable near-optimal replacement algorithms that try to duplicate the replacement decisions made by OPT.
Implementing a cache consistency protocol
TLDR
The protocol and its VLSI realization are described in some detail, to emphasize the important implementation issues, in particular the controller critical sections and the inter- and intra-cache interlocks needed to maintain cache consistency.