Cache-efficient string sorting using copying

@article{Sinha2006CacheefficientSS,
  title={Cache-efficient string sorting using copying},
  author={Ranjan Sinha and Justin Zobel and David Ring},
  journal={Journal of Experimental Algorithmics (JEA)},
  year={2006},
  volume={11},
  pages={1.2 - es}
}
Burstsort is a cache-oriented sorting technique that uses a dynamic trie to efficiently divide large sets of string keys into related subsets small enough to sort in cache. In our original burstsort, string keys sharing a common prefix were managed via a bucket of pointers represented as a list or array; this approach was found to be up to twice as fast as the previous best string sorts, mostly because of a sharp reduction in out-of-cache references. In this paper, we introduce C-burstsort… 
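The basic idea in the abstract, a trie whose leaves are buckets that are "burst" into new trie nodes once they grow too large, can be illustrated with a minimal Python sketch. This is not the authors' implementation and not the copying variant (C-burstsort) the paper introduces. The names `TrieNode`, `BURST_THRESHOLD`, and `burstsort` are hypothetical, the threshold is kept tiny for illustration, and each bucket is sorted with Python's built-in `sorted` rather than a cache-tuned string sort.

```python
from typing import Dict, List, Union

# Hypothetical bucket capacity, kept tiny for illustration; a practical
# burstsort would size buckets so that each one fits in cache.
BURST_THRESHOLD = 4


class TrieNode:
    """Trie node whose children are either sub-nodes or buckets of strings."""

    def __init__(self) -> None:
        # Strings that end exactly at this node (no characters left).
        self.exhausted: List[str] = []
        # Next character -> child node, or a bucket (plain list) of strings.
        self.children: Dict[str, Union["TrieNode", List[str]]] = {}


def _insert(node: TrieNode, s: str, depth: int) -> None:
    """Insert s below node, where the first `depth` characters are consumed."""
    while True:
        if depth == len(s):
            node.exhausted.append(s)
            return
        c = s[depth]
        child = node.children.get(c)
        if child is None:
            node.children[c] = [s]          # start a new bucket
            return
        if isinstance(child, TrieNode):
            node, depth = child, depth + 1  # descend one level
            continue
        child.append(s)
        if len(child) > BURST_THRESHOLD:
            # Burst: replace the bucket with a trie node and redistribute
            # its strings one character deeper.
            new_node = TrieNode()
            node.children[c] = new_node
            for t in child:
                _insert(new_node, t, depth + 1)
        return


def _collect(node: TrieNode, depth: int, out: List[str]) -> None:
    """In-order traversal: sort each small bucket and append it to out."""
    out.extend(node.exhausted)              # shorter strings come first
    for c in sorted(node.children):
        child = node.children[c]
        if isinstance(child, TrieNode):
            _collect(child, depth + 1, out)
        else:
            # All strings in the bucket share their first depth + 1
            # characters, so only the suffixes need to be compared.
            out.extend(sorted(child, key=lambda t: t[depth + 1:]))


def burstsort(strings: List[str]) -> List[str]:
    """Return the strings in ascending code-point order."""
    root = TrieNode()
    for s in strings:
        _insert(root, s, 0)
    out: List[str] = []
    _collect(root, 0, out)
    return out
```

Calling `burstsort(["band", "ban", "apple", "bat"])` returns the strings in the same order as Python's `sorted`. The sketch stores whole strings in the buckets for simplicity, whereas the abstract describes the original burstsort as keeping buckets of pointers.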
Engineering burstsort: Toward fast in-place string sorting
TLDR
Improvements are introduced that significantly reduce the memory requirement of burstsort, which is now less than 1% greater than that of an in-place algorithm.
Engineering Burstsort: Towards Fast In-Place String Sorting
TLDR
Improvements that significantly reduce the memory requirements of burstsort are introduced. In addition, during the bucket-sorting phase, the string suffixes are copied to a small buffer to improve their spatial locality, lowering the running time of burstsort by up to 30%.
Redesigning the string hash table, burst trie, and BST to exploit cache
TLDR
Two alternatives to the standard representation of strings are explored: the simple expedient of including the string in its node, and, for linked lists, the more drastic step of replacing each list of nodes by a contiguous array of characters.
Cache-Conscious Collision Resolution in String Hash Tables
TLDR
Two alternatives to the standard representation of string hash tables are explored: the simple expedient of including the string in its node, and the more drastic step of replacing each list of nodes by a contiguous array of characters.
Engineering scalable, cache and space efficient tries for strings
TLDR
A novel and practical solution that carefully combines a trie with a hash table, creating a variant of burst trie called HAT-trie, which is currently the leading in-memory trie-based data structure offering rapid, compact, and scalable storage and retrieval of variable-length strings.
Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools
TLDR
This dissertation focuses on two fundamental sorting problems, string sorting and suffix sorting. For string sorting, it proposes both multiway distribution-based sorting with string sample sort and multiway merge-based sorting with LCP-aware merge and mergesort, and it engineers and parallelizes both approaches.
Efficient parallel merge sort for fixed and variable length keys
TLDR
This work designs a high-performance parallel merge sort for highly parallel systems, and develops a scheme for sorting variable-length key/value pairs, with a special emphasis on string keys.
Can GPUs sort strings efficiently?
TLDR
This paper presents a fast and efficient string sort on the GPU that is built on the available radix sort and achieves a speedup of up to 10 over current GPU methods, especially on large datasets.
String sorting on multi and many-threaded architectures: A comparative study
TLDR
This work produces a comparative study of the most popular and efficient string sorting algorithms implemented on CPU and GPU machines, together with an efficient parallel multi-key quicksort implementation that uses a ternary search tree to increase the speedup and efficiency of sorting large sets of string data.
On demand string sorting over unbounded alphabets

References

Using random sampling to build approximate tries for efficient string sorting
TLDR
New variants of burstsort, a string-sorting algorithm that on large sets of strings is almost twice as fast as previous algorithms, primarily because it is more cache-efficient, are introduced: SR-burstsort, DR-burstsort, and DRL-burstsort.
Cache-conscious sorting of large sets of strings with dynamic tries
TLDR
This work proposes a new sorting algorithm for strings, burstsort, based on dynamic construction of a compact trie in which strings are kept in buckets, which is simple, fast, and efficient.
Burst tries: a fast, efficient data structure for string keys
TLDR
These experiments show that the burst trie is particularly effective for the skewed frequency distributions common in text collections, and dramatically outperforms all other data structures for the task of managing strings while maintaining sort order.
Analysing cache effects in distribution sorting
TLDR
An approximate analysis of distribution sorting of uniform keys is presented, which predicts the expected cache misses of Flashsort1 quite well, and it is shown that the integer distribution sorting algorithm MSB radix sort performs well on both uniform integer and uniform floating-point keys.
CC-Radix: a cache conscious sorting based on Radix sort
TLDR
CC-Radix improves the data locality by dynamically partitioning the data set into subsets that fit in cache level L2.
Adapting Radix Sort to the Memory Hierarchy
TLDR
The importance of reducing misses in the translation-lookaside buffer (TLB) for obtaining good performance on modern computer architectures is demonstrated and three techniques which simultaneously reduce cache and TLB misses for LSB radix sort are given: reducing working set size, explicit block transfer and pre-sorting.
Radix Sorting & Searching
TLDR
This thesis shows that string sorting can be reduced to integer sorting in optimal asymptotic time and it is shown that a unit-cost RAM with a word length of w bits can sort n word-sized integers in O(n log log n) time for arbitrary w >= log n, a significantly improved upper bound for sorting.
Fast algorithms for sorting and searching strings
TLDR
This work presents theoretical algorithms for sorting and searching multikey data, derives from them practical C implementations for applications in which keys are character strings, and presents extensions to more complex string problems, such as partial-match searching.
Efficient sorting using registers and caches
TLDR
This paper introduces a new cache-conscious sorting algorithm, R-MERGE, which achieves better performance in practice than algorithms that are superior in the theoretical models, and quantifies the performance effects of features not reflected in the models.
Engineering Radix Sort
TLDR
Three ways to sort strings by bytes left to right (a stable list sort, a stable two-array sort, and an in-place "American flag" sort) are illustrated with practical C programs, and all three perform comparably, usually running at least twice as fast as a good quicksort.