Evaluating Design Choices for Shared Bus Multiprocessors in a Throughput-Oriented Environment

@article{Chiang1992EvaluatingDC,
  title={Evaluating Design Choices for Shared Bus Multiprocessors in a Throughput-Oriented Environment},
  author={MenChow Chiang and Gurindar S. Sohi},
  journal={IEEE Trans. Computers},
  year={1992},
  volume={41},
  pages={297--317}
}
The authors evaluate design choices for multiprocessors with a single shared-bus interconnect, operating in an environment in which each task executes on a single processor and the performance of the multiprocessor is measured by its overall throughput. To evaluate design choices, they develop mean value analysis (MVA) analytical models and validate the models by comparing their results against those of a trace-driven simulation analysis for 5376 multiprocessor…
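As a rough illustration of the kind of technique the abstract names, the sketch below implements textbook exact mean value analysis for a single-class closed queueing network. It is not the authors' model: the function name `mva_shared_bus`, the choice of one delay center (processor compute time between bus requests) plus a list of queueing centers (for example, the bus), and all parameter values are assumptions made only for illustration.

```python
from typing import List, Tuple


def mva_shared_bus(
    num_processors: int, think_time: float, demands: List[float]
) -> Tuple[float, List[float]]:
    """Exact single-class MVA for a closed network (illustrative sketch).

    num_processors: number of circulating customers (hypothetical: one per CPU).
    think_time:     mean time a processor computes between bus requests.
    demands:        mean service demand at each queueing center (e.g. the bus).
    Returns (system throughput, per-center utilizations).
    """
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")

    queue_lengths = [0.0] * len(demands)  # mean queue lengths with 0 customers
    throughput = 0.0
    for n in range(1, num_processors + 1):
        # Arrival theorem: an arriving customer sees the queue lengths of
        # the same network with one fewer customer.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_lengths)]
        # Little's law applied to the whole network.
        throughput = n / (think_time + sum(residence))
        # Little's law applied to each queueing center.
        queue_lengths = [throughput * r for r in residence]

    utilizations = [throughput * d for d in demands]
    return throughput, utilizations


if __name__ == "__main__":
    # Hypothetical numbers: 8 processors, 4 time units of compute per request,
    # 1 time unit of bus service per request.
    x, u = mva_shared_bus(8, 4.0, [1.0])
    print(f"throughput={x:.4f} requests/unit time, bus utilization={u[0]:.4f}")
```

The recursion builds the solution for n customers from the solution for n-1, so the cost grows linearly with the number of processors. That low cost is what makes such models practical for sweeping a large design space before resorting to simulation.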
Experience with mean value analysis model for evaluating shared bus, throughput-oriented multiprocessors
TLDR
This paper reports on experience with the accuracy of mean value analysis analytical models for evaluating shared bus multiprocessors operating in a throughput-oriented environment and finds that the analytical models are accurate in predicting the individual processor throughputs and partial bus utilizations.
Performance modelling and evaluation for the XMP shared-bus multiprocessor architecture
TLDR
The features of the SSTP bus scheme, as well as two important performance-impacting factors (cache, bus, and memory interference, and DMA transfer), are modelled to assist in evaluating the architectural alternatives of XMP.
Performance analysis of a separate address/data bus multiprocessor system
TLDR
Results show that the values of some key design parameters, such as cache line size and data‐bus width that yield the best throughput, are dependent on the performance of subsystems.
A Subsystem-Oriented Performance Analysis Methodology for Shared-Bus Multiprocessors
TLDR
The subsystem-oriented view of the proposed methodology facilitates divide-and-conquer modeling and bottleneck analysis, which has rarely been addressed previously, and leads to a simple, general, and systematic approach to the analytical modeling and analysis of complex multiprocessor systems.
An Easy-to-Use Approach for Practical Bus-Based System Design
TLDR
The model relates the shared-bus width, bus cycle time, cache memory, the features of a program execution, and the number of processors on a shared bus to a metric called request utilization, which acts as the scaling factor for the effective average waiting processors in computing the queuing delay cycles.
An analytical model of high performance superscalar-based multiprocessors
TLDR
An MVA multiprocessor performance model is presented which incorporates new features of superscalar processors and, in addition, increases the level of modeling detail to improve flexibility and accuracy.
A Multiprocessor Bus Design Model Validated by System Measurement
TLDR
The model is shown to accurately predict measured system performance for two parallel program workloads that have different memory access characteristics and provides evidence that analytic queueing models can be extremely accurate in spite of simplifying assumptions required for model tractability.
Tradeoffs in the Design of Single Chip Multiprocessors
TLDR
This paper proposes to analyze the tradeoffs involved in designing a multiprocessor system on a single chip, addresses whether to allocate available chip area to larger caches or to a larger number of processors, and shows that adding processors at the expense of cache size improves performance up to a point.
Performance Evaluation and Modeling Techniques for Parallel Processors
TLDR
It is clear that parallel processors of the future will be required to offer the user a time-shared environment with reasonable response times for the applications, and this is especially true for parallel processors, where the costs and benefits of multi-user workloads are exacerbated.
A Mean Value Analysis Multiprocessor Model Incorporating Superscalar Processors and Latency Tolerating Techniques
TLDR
An analytical performance model is presented which extends previous multiprocessor MVA models by incorporating these new features and in addition, increases the level of modeling detail to improve flexibility and accuracy.

References

Experience with mean value analysis model for evaluating shared bus, throughput-oriented multiprocessors
TLDR
This paper reports on experience with the accuracy of mean value analysis analytical models for evaluating shared bus multiprocessors operating in a throughput-oriented environment and finds that the analytical models are accurate in predicting the individual processor throughputs and partial bus utilizations.
Performance analysis of multiprocessor cache consistency protocols using generalized timed Petri nets
TLDR
An exact analytical technique is used, based on Generalized Timed Petri Nets, to study the performance of shared bus cache consistency protocols for multiprocessors and quantitatively assess the performance gains for each of the four enhancements.
Cache coherence protocols: evaluation using a multiprocessor simulation model
TLDR
The magnitude of the potential performance difference between the various approaches indicates that the choice of coherence solution is very important in the design of an efficient shared-bus multiprocessor, since it may limit the number of processors in the system.
An accurate and efficient performance analysis technique for multiprocessor snooping cache-consistency protocols
A family of dynamic cache-consistency-protocols for shared-bus multiprocessor systems is considered. A modeling approach, based on the specification and the iterative solution of sets of equations
Modeling Bus Contention and Memory Interference in a Multiprocessor System
TLDR
Stochastic models of contention for shared resources in an experimental multiprocessor prototype are presented and are validated with simulation and measurement results that show that the accuracy of the analytical results is excellent.
A characterization of sharing in parallel programs and its application to coherency protocol evaluation
  • S. Eggers, R. Katz
  • Computer Science
    [1988] The 15th Annual International Symposium on Computer Architecture. Conference Proceedings
  • 1988
TLDR
The results indicate that the amount of write sharing in all programs is small, and that it is characterized by short-to-medium sequences of per-processor references, with little contention for either data or locks.
Analysis of Multiprocessors with Private Cache Memories
  • J. Patel
  • Engineering
    IEEE Transactions on Computers
  • 1982
TLDR
An approximate analytical model for the performance of multiprocessors with private cache memories and a single shared main memory is presented and is found to be very good over a broad range of parameters.
Aspects of Cache Memory and Instruction
TLDR
Techniques are developed in this dissertation to efficiently evaluate direct-mapped and set-associative caches for single-chip RISC microprocessors, and it is demonstrated that instruction buffers will be preferred to target instruction buffers in future RISC microprocessors implemented on single CMOS chips.
Performance tradeoffs in cache design
TLDR
The tradeoffs between cache size and CPU/cache cycle time, set associativity and cycle time, and block size and main-memory speed are examined, and the results indicate that neither cycle time nor cache size dominates the other across the entire design space.