Event pool structures for PDES on many-core Beowulf clusters

@inproceedings{Dickman2013EventPS,
  title={Event pool structures for PDES on many-core Beowulf clusters},
  author={T. Dickman and Sounak Gupta and P. Wilsey},
  booktitle={Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation},
  year={2013}
}
  • T. Dickman, Sounak Gupta, P. Wilsey
  • Published 2013
  • Computer Science
  • Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
Multi-core and many-core processing chips are becoming widespread and are now being widely integrated into Beowulf clusters. This poses a challenging problem for distributed simulation as it now becomes necessary to extend the algorithms to operate on a platform that includes both shared memory and distributed memory hardware. Furthermore, as the number of on-chip cores grows, the challenges for developing solutions without significant contention for shared data structures grow. This is…
A Non-Blocking Priority Queue for the Pending Event Set
TLDR
The design and implementation of a concurrent non-blocking pending events' set data structure, which can be seen as a variant of a classical calendar queue, is presented, showing excellent scalability of the proposal on a machine equipped with 32 CPU-cores.
A Lock-Free O(1) Event Pool and Its Application to Share-Everything PDES Platforms
TLDR
This article presents a lock-free event pool which also provides amortized O(1) time complexity for both insertions and extractions, and can sustain highly concurrent accesses, while not leading to noticeable performance degradation when scaling up the thread count.
Lock-free pending event set management in time warp
TLDR
This work assumes that the pending events within any one bucket are causally independent and schedules them for execution without sorting and without consideration of their total time-based order, and uses the Time Warp mechanism to recover whenever actual dependencies arise.
A Conflict-Resilient Lock-Free Calendar Queue for Scalable Share-Everything PDES Platforms
TLDR
This article presents a conflict-resilient non-blocking calendar queue that enables conflicting dequeue operations, concurrently attempting to extract the minimum element, to survive, thus improving the level of scalability of accesses to the hot portion of the data structure, namely the bucket to which the current locality of the events to be processed is bound.
Performance comparison of Cross Memory Attach capable MPI vs. Multithreaded Optimistic Parallel Simulations
  • D. Rao
  • Computer Science
  • SIGSIM-PADS
  • 2018
TLDR
This paper compares the performance of a CMA-capable, MPI-based version to the authors' fine-tuned multithreaded version and suggests that more in-depth analysis of model characteristics is needed to decide between shared-memory multithreading and message-passing approaches.
A low-overhead constant-time Lowest-Timestamp-First CPU scheduler for high-performance optimistic simulation platforms
  • F. Quaglia
  • Computer Science
  • Simul. Model. Pract. Theory
  • 2015
TLDR
The article presents a low-overhead constant-time implementation of the well-known Lowest-Timestamp-First algorithm for the identification of the next LP to be CPU-dispatched, suited for contexts where the optimistic simulation system conforms to the best practice of keeping separate event lists for the hosted LPs.
Performance Evaluation of Priority Queues for Fine-Grained Parallel Tasks on GPUs
TLDR
This work evaluates the performance of GPU-based priority queue implementations for two applications, discrete-event simulation and parallel A* path searches on grids, and presents performance measurements covering linear queue designs, implicit binary heaps, splay trees, and a GPU-specific proposal from the literature.
Time Warp Simulation on Multi-Core Platforms
  • P. Wilsey
  • Computer Science
  • 2019 Winter Simulation Conference (WSC)
  • 2019
TLDR
This tutorial reviews parallel computing and the properties of Discrete Event Simulation (DES) models and then examines the construction of PDES solutions that use the Time Warp Synchronization Mechanism.
Experiments with Hardware-based Transactional Memory in Parallel Simulation
TLDR
Both forms of transactional memory found in the Intel Haswell processor, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), are evaluated; the results show that RTM generally outperforms conventional locking mechanisms and that HLE provides consistently better performance than conventional locking mechanisms.
Managing Pending Events in Sequential & Optimistic Parallel Discrete Event Simulations
Julius Didier Higiro. The choice of data structure for managing and processing pending events in timestamp…

References

Threaded WARPED : An Optimistic Parallel Discrete Event Simulator for Cluster of Multi-Core Machines
TLDR
The modifications made to implement threaded WARPED are explained and the performance capabilities of the two solutions for managing the shared data structures are evaluated.
Using parallel data structures in optimistic discrete event simulation of varying granularity on shared-memory computers
  • S. Prasad, S. I. Sawant, B. Naqib
  • Computer Science
  • Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing
  • 1995
TLDR
A variety of parallel priority queue data structures are investigated to implement the global event queue of the optimistic discrete event simulation, demonstrating speedups of two to four using six processors of a small bus-based shared-memory computer.
Towards Symmetric Multi-threaded Optimistic Simulation Kernels
TLDR
This article addresses the reshuffle of the design of optimistic simulation kernels in order to fit multi-core/multi-processor machines by providing a reference optimistic simulation architecture based on the symmetric multi-threaded paradigm and presents a real implementation of this architecture within the ROme OpTimistic Simulator (ROOT-Sim).
Efficient implementation of event sets in Time Warp
TLDR
An improved version of the skew heap that allows dequeueing of arbitrary elements at low cost is presented; this capability can improve memory utilization and is also important in applications where frequent rescheduling may occur.
Parameterized Time Warp (PTW): An Integrated Adaptive Solution to Optimistic PDES
TLDR
This new, improved Time Warp synchronization mechanism, termed Parameterized Time Warp, provides an integrated adaptive solution to optimistic Parallel Discrete Event Simulation.
Clustered time warp and logic simulation
TLDR
This paper presents a hybrid algorithm which makes use of Time Warp between clusters of LPs and a sequential algorithm within the cluster and develops a family of three checkpointing algorithms, each of which occupies a different point in the spectrum of possible trade-offs between memory usage and execution time.
Using DVFS to optimize time Warp simulations
  • Ryan Child, P. Wilsey
  • Computer Science
  • Proceedings of the 2012 Winter Simulation Conference (WSC)
  • 2012
TLDR
This work explores the adjustment of operating frequencies of cores executing on and off the critical path to reduce rollback and power consumption, while maintaining or, in some cases, enhancing performance.
Dynamic load management in the time warp operating system
TLDR
Early results of the experiments with the TWOS dynamic load management facility are described, the theory and mechanics of phase splitting and migration are covered, and performance results for several combinations of benchmark, configuration, and load management policy are presented.
WARPED: a time warp simulation kernel for analysis and application development
TLDR
WARPED is a publicly available time warp simulation kernel for experimentation and application development that supports LP (logical process) clustering, various time warp algorithms and several optimizations that dynamically adjust simulation parameters.
Power7: IBM's Next-Generation Server Processor
The Power7 is IBM's first eight-core processor, with each core capable of four-way simultaneous-multithreading operation. Its key architectural features include an advanced memory hierarchy with…