• Corpus ID: 9423663

Techniques for Cache and Memory Simulation Using Address Reference Traces

@article{Holliday1991TechniquesFC,
  title={Techniques for Cache and Memory Simulation Using Address Reference Traces},
  author={Mark A. Holliday},
  journal={Int. J. Comput. Simul.},
  year={1991},
  volume={1}
}
Simulation using address reference traces is one of the primary methods for the performance evaluation of the memory hierarchy of computer systems. [...] Key Method: A relatively new technique, inline simulation, attempts to avoid a number of the problems associated with traditional trace-driven simulation. Index Terms: address reference traces, trace-driven simulation, survey, inclusion property, trace reduction, one-pass simulation, parallel traces, global trace problem, inline simulation.

Accuracy of Memory Reference Traces of Parallel Computations in Trace-Driven Simulation

TLDR
An extension of traditional process traces, called the intrinsic trace of each process, is developed, which maximizes the decoupling of program execution from simulation by describing the address flow graph and path expressions of each process program.

Efficient Full System Memory Tracing with Simutrace

TLDR
Memory traces heavily depend on an efficient encoding, a scalable trace format and a tracing mechanism that is capable of dealing with a high rate of incoming events.

Real-time L3 cache simulations using the Programmable Hardware-Assisted Cache Emulator (PHA$E)

TLDR
This paper discusses the design and implementation of the PHA$E, a system for emulating cache in real-time, and the results of simulating varying sizes of off-chip L3 caches on various workloads are presented and analyzed.

Simutrace: A Toolkit for Full System Memory Tracing

TLDR
Simutrace is presented, a tracing framework for efficient full system memory tracing, using functional full system simulation for holistic memory traces, and incorporates an aggressive but fast compressor, making full length, no-loss memory traces of long-running workloads possible.

Trap-driven memory simulation

TLDR
This dissertation compares the trace-driven and trap-driven methods on the basis of their flexibility, portability, speed, and accuracy and shows that trap-driven simulation offers clear advantages over trace-driven simulation with respect to speed and accuracy.

Using static program analysis to compile fast cache simulators

TLDR
This thesis presents a generic approach towards compiling fast execution-driven simulators, and applies this to cache simulation of programs, to reduce the time needed for cache performance evaluations without losing the accuracy of the results.

Trap-driven memory simulation with Tapeworm II

TLDR
Both the strengths and the weaknesses of trap-driven simulation are exposed with respect to speed, accuracy, completeness, portability, flexibility, ease-of-use, and memory overhead.

Constructing multiprocessor workload characterizations

TLDR
This work traces a single processor system executing an N-processor workload, then performs static analysis on the trace and produces individual process characterizations that can be used to build input workloads for models of multiple processor systems.

Efficient Generation Of Synthetic Traces

  • L. Barriga, R. Ayani
  • Computer Science
    Proceedings. Second Euromicro Workshop on Parallel and Distributed Processing
  • 1994
TLDR
An analysis of the ST model and its implementation shows that the bottlenecks lie in hyperbolic function computations and updating large LRU stacks.

THE TIMEKEEPING METHODOLOGY: EXPLOITING GENERATIONAL LIFETIME BEHAVIOR TO IMPROVE PROCESSOR POWER AND PERFORMANCE

TLDR
This thesis demonstrates how processors can be optimized by exploiting knowledge about time durations between key processor and memory events, using characteristics of key timekeeping metrics.

References


Accuracy of Memory Reference Traces of Parallel Computations in Trace-Driven Simulation

TLDR
An extension of traditional process traces, called the intrinsic trace of each process, is developed, which maximizes the decoupling of program execution from simulation by describing the address flow graph and path expressions of each process program.

Evaluating the performance of software cache coherence

TLDR
An analytical model of the performance of two software coherence schemes and, for comparison, snoopy-cache hardware is presented and it is determined that both scale well.

TRAPEDS: producing traces for multicomputers via execution driven simulation

TLDR
A new technique is presented in this paper which modifies the executable code to dynamically collect the address trace from the user code and analyzes this trace during the execution of the program, which helps resolve the I/O and storage problems and facilitates parallel analysis of the address trace.

Memory-reference characteristics of multiprocessor applications under MACH

TLDR
The amount of sharing in user programs and in the operating system, comparing the characteristics of user and system reference patterns, sharing related to process migration, and the temporal, spatial, and processor locality of shared blocks are addressed.

Aspects of cache memory and instruction buffer performance

TLDR
Techniques are developed in this dissertation to efficiently evaluate direct-mapped and set-associative caches and examine instruction caches for single-chip RISC microprocessors, and it is demonstrated that instruction buffers will be preferred to target instruction buffers in future RISC microprocessors implemented on single CMOS chips.

Accurate Low-Cost Methods for Performance Evaluation of Cache Memory Systems

TLDR
New methods of simulation based on statistical techniques are proposed for decreasing the need for large trace measurements and for predicting true program behavior, and a new concept of primed cache is introduced to simulate large caches by the sampling-based method.

Generation and analysis of very long address traces

  • A. Borg, R. Kessler, D. W. Wall
  • Computer Science
    [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture
  • 1990
TLDR
A trace-generation mechanism based on link-time code modification which is simple to use, generates accurate long traces of multiuser programs, runs on a RISC (reduced-instruction-set-computer) machine, and can be flexibly controlled is used.

Cache memory performance in a UNIX environment

TLDR
The intent is to credibly quantify the performance implications of parameter selection in a manner which emphasizes implementation tradeoffs using address reference traces obtained from typical multitasking UNIX workloads to research cache memory performance.

ATUM: a new technique for capturing address traces using microcode

TLDR
A new technique has been developed to use a processor's microcode to record addresses in a reserved part of main memory as a side effect of normal execution, making it possible to gather full operating-system traces of multi-tasking workloads.

An analytical cache model

TLDR
An analytical cache model is developed that gives miss rates for a given trace as a function of cache size, degree of associativity, block size, subblock size, multiprogramming level, task switch interval, and observation interval.
...