• Corpus ID: 5327880

SPsort: How to Sort a Terabyte Quickly

Jim Wyllie
In December 1998, a 488-node IBM RS/6000 SP sorted a terabyte of data (10 billion 100-byte records) in 17 minutes, 37 seconds. This is more than 2.5 times faster than the previous record for a problem of this magnitude. The SPsort program itself was custom-designed for this benchmark, but the cluster, its interconnection hardware, disk subsystem, operating system, file system, communication library, and job management software are all IBM products. The system sustained an aggregate data rate of…


Sorting on a Cluster Attached to a Storage-Area Network
In November 2004, the SAN Cluster Sort program (SCS) set new records for the Indy versions of the Minute and TeraByte Sorts. SCS ran on a cluster of 40 dual-processor Itanium2 nodes on the show floor
A High Speed Disk-to-disk Sort on a Windows NT Cluster Running HPVM
We describe the porting, redesign, and tuning of a high performance disk-to-disk parallel sort on a general purpose Myrinet connected PC cluster running Windows NT. This cluster employs the high
Distribution-Insensitive Parallel External Sorting on PC Clusters
This paper presents two distribution-insensitive parallel external sorting algorithms that use sampling and histogram counts to distribute data evenly among processors, which in turn contributes to superb performance.
High-speed parallel external sorting of data with arbitrary distribution
Two distribution-insensitive, scalable parallel external sorting algorithms are developed; they use sampling and histogram counts to distribute keys evenly, which in turn contributes to good performance.
Performance of the IBM general parallel file system
The performance and scalability of IBM's General Parallel File System (GPFS) under a variety of conditions are measured to give performance recommendations for application development and as a guide to the improvement of parallel file systems.
Algorithm engineering for large data sets
This book teaches students, researchers, and software developers how the interplay of hardware, software, and state-of-the-art algorithms helps achieve high-performance processing of massive data.
Dynamic I/O-aware load balancing and resource management for clusters
The empirical results have demonstrated that the feedback control mechanism can not only be leveraged to enhance the performance of load-balancing schemes, but also be applied to clusters where workload conditions exhibit dynamic behaviors.
MapReduce: Simplified Data Processing on Large Clusters
This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.
STXXL: standard template library for XXL data sets
The software library STXXL is presented: an implementation of the C++ standard template library (STL) for processing huge data sets that fit only on hard disks. It is the first I/O-efficient algorithm library that supports the pipelining technique, which can save more than half of the I/Os.
Algorithm Engineering: 4th International Workshop, WAE 2000 Saarbrücken, Germany, September 5–8, 2000 Proceedings
  • J. V. Leeuwen
  • Computer Science
    Lecture Notes in Computer Science
  • 2001
It turns out that the performance of the algorithms depends heavily on the characteristics of the respective workload, and on real-world jobs the new algorithms often outperform Graham’s strategy.


High-performance sorting on networks of workstations
We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on a NOW is competitive with sorting on the large-scale
A super scalar sort algorithm for RISC processors
New sort algorithms which eliminate almost all the compares, provide functional parallelism which can be exploited by multiple execution units, significantly reduce the number of passes through keys, and improve data locality are developed.
Performance / Price Sort and PennySort
This paper documents this and proposes that the PennySort benchmark be revised to Performance/Price sort: a simple GB/$ sort metric based on a two-pass external sort.
A measure of transaction processing power
These benchmarks measure the performance of diverse transaction processing systems and a standard system cost measure is stated and used to define price/performance metrics.
Parallel sorting on a shared-nothing architecture using probabilistic splitting
  • D. DeWitt, J. Naughton, D. Schneider
  • Computer Science
    [1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems
  • 1991
The authors consider the problem of external sorting in a shared-nothing multiprocessor with two techniques for determining ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, which uses sampling to estimate quantiles.
MPI: The Complete Reference
MPI: The Complete Reference is an annotated manual for the latest 1.1 version of the standard that illuminates the more advanced and subtle features of MPI and covers such advanced issues in parallel computing and programming as true portability, deadlock, high-performance message passing, and libraries for distributed and parallel computing.
/research.microsoft.com/barc/SortBenchmark/ Sort benchmark home page
Home page of Ordinal Corp., whose NSORT program holds a record for MinuteSort and performed the first reported terabyte sort