AlphaSort: a RISC machine sort
- C. Nyberg, Tom Barclay, Z. Cvetanovic, J. Gray, D. Lomet
- Computer Science · ACM SIGMOD Conference
- 24 May 1994
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads and proposes two new benchmarks: MinuteSort (how much can you sort in a minute) and DollarSort (how much can you sort for a dollar).
Extending OpenMP For NUMA Machines
- John Bircsak, Peter Craig, Carl D. Offner
- Computer Science · International Conference on Software Composition
- 1 August 2000
Extensions to OpenMP Fortran that implement data placement features needed for NUMA architectures are described, along with some of the techniques that the Compaq Fortran compiler uses to generate efficient code based on these extensions.
Characterization of Alpha AXP performance using TP and SPEC workloads
- Z. Cvetanovic, D. Bhandarkar
- Computer Science · Proceedings of 21 International Symposium on…
- 18 April 1994
A simple model for evaluating the effects of various design tradeoffs, based on data collected with hardware monitors, is proposed; it indicates that Alpha AXP takes advantage of lower cycles per instruction and cycle time to achieve a significant performance advantage.
AlphaSort: A cache-sensitive parallel external sort
- C. Nyberg, Tom Barclay, Z. Cvetanovic, J. Gray, D. Lomet
- Computer Science · The VLDB Journal
- 1 October 1995
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads and argues that modern architectures require algorithm designers to re-examine their use of the memory hierarchy.
Performance analysis of the Alpha 21264-based Compaq ES40 system
- Z. Cvetanovic, R. Kessler
- Computer Science · Proceedings of 27th International Symposium on…
- 1 May 2000
It is found that the Compaq ES40 often provides 2 to 3 times the performance of the AlphaServer 4100 at similar clock frequencies, and the ES40 memory system has about five times the memory bandwidth of the 4100.
Performance analysis of the Alpha 21364-based HP GS1280 multiprocessor
- Z. Cvetanovic
- Computer Science · 30th Annual International Symposium on Computer…
- 9 June 2003
It is found that the HP GS1280 often provides 2 to 3 times the performance of the AlphaServer GS320 at similar clock frequencies, and that the key reasons are advances in memory, interprocessor, and I/O subsystem designs.
Performance characterization of the Alpha 21164 microprocessor using TP and SPEC workloads
- Z. Cvetanovic, D. Bhandarkar
- Computer Science · Proceedings. Second International Symposium on…
- 3 February 1996
The AlphaServer 8200 provides 2 to 3 times the performance of the DEC 7000 server, owing to its faster clock, larger on-chip cache, expanded multiple-issuing, lower cache/memory latencies, and higher bandwidth.
The Effects of Problem Partitioning, Allocation, and Granularity on the Performance of Multiple-Processor Systems
- Z. Cvetanovic
- Computer Science · IEEE Transactions on Computers
- 1 April 1987
The results indicate that for algorithms where both the computation and the communication overhead can be fully decomposed among N processors, the speedup is a nondecreasing function of the level of granularity for arbitrary interconnection structure and allocation of subproblems to processors.
Performance Analysis of the FFT Algorithm on a Shared-Memory Parallel Architecture
- Z. Cvetanovic
- Computer Science · IBM Journal of Research and Development
- 1 July 1987
The results indicate that the communication delay is significantly affected by the method applied to allocate data to memory modules, and the communication time complexity is increased to O(log N) since all N requests generated by processors are serialized at a single memory module.
Efficient decomposition and performance of parallel PDE, FFT, Monte Carlo simulations, simplex, and sparse solvers
- Z. Cvetanovic, E. Freedman, C. Nofsinger
- Computer Science · Proceedings SUPERCOMPUTING '90
- 12 November 1990
In this paper, we describe the decomposition of six algorithms: two partial differential equations (PDE) solvers (successive over-relaxation [SOR] and alternating direction implicit [ADI]), fast…