- Greg Faanes, Abdulla Bataineh, Duncan Roweth, Tom Court, Edwin Froese, Robert Alverson +4 others
- SC
- 2012

Higher global bandwidth requirement for many applications and lower network cost have motivated the use of the Dragonfly network topology for high performance computing systems. In this paper we present the architecture of the Cray Cascade system, a distributed memory system based on the Dragonfly [1] network topology. We describe the structure of the… (More)

A parallel sorting algorithm for sorting n elements evenly distributed over 2^d = p nodes of a d-dimensional hypercube is presented. The average running time of the algorithm is O((n log n)/p + p log^2 n). The algorithm maintains a perfect load balance in the nodes by determining the (kn/p)th elements (k = 1, ..., p-1) of the final sorted list in advance.… (More)

- Dennis Abts, Abdulla Bataineh, Steve Scott, Greg Faanes, James L. Schwarzmeier, Eric Lundberg +3 others
- SC
- 2007

This paper describes the system architecture of the Cray BlackWidow scalable vector multiprocessor. The BlackWidow system is a distributed shared memory (DSM) architecture that is scalable to 32K processors, each with a 4-way dispatch scalar execution unit and an 8-pipe vector unit capable of 20.8 Gflops for 64-bit operations and 41.6 Gflops for 32-bit… (More)

In this paper, we propose logic simulation techniques using parallel and vector machines to reduce simulation time of large digital circuits. Three algorithms for logic simulation have been developed and implemented on the Cray Y-MP supercomputer, a general-purpose shared-memory parallel machine with vector processors. The first algorithm is a vector… (More)

Scalable shared-memory multiprocessors provide a flexible programming model with good performance scaling. These features, however, come at the expense of additional hardware complexity to provide a consistent view of the memory hierarchy. Verifying this aspect of a multiprocessor system is nontrivial, often requiring far more time than the actual… (More)

In this paper, we present algorithms for logic and fault simulation, developed and implemented on the Cray Y-MP supercomputer, a general-purpose shared-memory parallel machine with vector processors. The parallel-and-vector version of the event-driven logic simulation algorithm achieves a speedup of 52 on the Cray Y-MP with 8 processors, with a maximum… (More)
