• Corpus ID: 1339537

Datamation 2001: A Sorting Odyssey

@inproceedings{Popovici2001Datamation2A,
  title={Datamation 2001: A Sorting Odyssey},
  author={Florentina I. Popovici and John Bent and Brian C. Forney and Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau},
  year={2001}
}
We present our experience of turning a Linux cluster into a high-performance parallel sorting system. Our implementation, WIND-SORT, broke the Datamation record by roughly a factor of two, sorting 1 million 100-byte records in 0.48 seconds. We have identified three keys to our success: developing a fast remote execution service, configuring the cluster properly, and avoiding the potential ill-effects of occasionally faulty hardware.
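
To put the figures in the abstract in perspective, the following is a minimal single-process sketch of a Datamation-style sort, together with the aggregate throughput implied by 1 million 100-byte records in 0.48 seconds. It is not WIND-SORT and says nothing about how the authors' parallel implementation works. The 10-byte key prefix is an assumption taken from the usual Datamation benchmark definition, and the names `sort_records` and `throughput_mb_per_s` are hypothetical.

```python
import os

RECORD_SIZE = 100        # bytes per record (from the abstract)
KEY_SIZE = 10            # assumption: the key is the first 10 bytes of each record
NUM_RECORDS = 1_000_000  # record count (from the abstract)
ELAPSED_S = 0.48         # reported sort time in seconds (from the abstract)


def sort_records(data: bytes) -> bytes:
    """Sort fixed-size records by their leading KEY_SIZE bytes."""
    if len(data) % RECORD_SIZE != 0:
        raise ValueError("input is not a whole number of records")
    records = [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
    records.sort(key=lambda rec: rec[:KEY_SIZE])  # bytewise key comparison
    return b"".join(records)


def throughput_mb_per_s(num_records: int, record_size: int, seconds: float) -> float:
    """Data volume divided by elapsed time, in MB/s (1 MB = 10**6 bytes)."""
    return num_records * record_size / seconds / 1e6


if __name__ == "__main__":
    # Sort a small random sample; a real run would read the records from disk.
    sample = os.urandom(RECORD_SIZE * 1000)
    result = sort_records(sample)
    print(len(result) // RECORD_SIZE, "records sorted")
    # 100 MB in 0.48 s is roughly 208 MB/s across the whole cluster.
    print(round(throughput_mb_per_s(NUM_RECORDS, RECORD_SIZE, ELAPSED_S), 1), "MB/s")
```

The throughput figure counts only the 100 MB of input once; the benchmark also requires reading the input from disk and writing the sorted output back, so the total I/O volume is higher.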


Datamation: A Quarter of a Century and Four Orders of Magnitude Later
TLDR
Of the many implementation and configuration choices the authors faced, the most crucial were judicious data placement and access patterns on disk, adoption of UDP sockets instead of MPI, careful pruning of virtually all system daemons, and rejection of "on demand" frequency scaling.
Sorting on a Cluster Attached to a Storage-Area Network
In November 2004, the SAN Cluster Sort program (SCS) set new records for the Indy versions of the Minute and TeraByte Sorts. SCS ran on a cluster of 40 dual-processor Itanium2 nodes on the show floor
Distribution-Insensitive Parallel External Sorting on PC Clusters
TLDR
This paper presents two distribution-insensitive parallel external sorting algorithms that use a sampling technique and histogram counts to distribute data evenly among processors, which in turn contributes to high performance.
DMSort: A PennySort and Performance/Price Sort
TLDR
The DMSort system is discussed, which is capable of more than double the performance of previously published results when run on the authors' system configuration.
High-speed parallel external sorting of data with arbitrary distribution
TLDR
Two distribution-insensitive, scalable parallel external sorting algorithms are developed that use a sampling technique and histogram counts to distribute keys evenly, which in turn contributes to good performance.
Parallel external sort of floating-point data by integer conversion
TLDR
This paper introduces a fast external sorting algorithm for floating-point numbers that uses integer operations only, which shortens the computing time significantly.
From Sand to Flour: The Next Leap in Granular Computing with NanoSort
TLDR
NanoSort, a distributed sorting algorithm running on the nanoPU, is built, and it is shown that NanoSort can sort 1M keys in 68 µs, an order of magnitude faster than MilliSort, the current state-of-the-art.
