• Corpus ID: 16327368

Fast algorithms for sorting and searching strings

@inproceedings{Bentley1997FastAF,
  title={Fast algorithms for sorting and searching strings},
  author={Jon Louis Bentley and Robert Sedgewick},
  booktitle={SODA '97},
  year={1997}
}
We present theoretical algorithms for sorting and searching multikey data, and derive from them practical C implementations for applications in which keys are character strings. The sorting algorithm blends Quicksort and radix sort; it is competitive with the best known C sort codes. The searching algorithm blends tries and binary search trees; it is faster than hashing and other commonly used search methods. The basic ideas behind the algorithms date back at least to the 1960s, but their… 
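The two blends named in the abstract can be illustrated with a minimal sketch. This is not the authors' C code: it is a simplified Python rendering, and the names `multikey_quicksort`, `TSTNode`, `tst_insert`, and `tst_contains` are hypothetical. The sort partitions three ways on a single character position, as a radix sort would inspect it, and recurses as Quicksort does. The search structure keeps a binary-search-tree ordering among characters at one position (`lo`/`hi`) and a trie-like link to the next position (`eq`). The sort builds new lists rather than partitioning in place, which a practical implementation would likely avoid.

```python
def multikey_quicksort(strings, depth=0):
    """Return a sorted list, partitioning three ways on the character at `depth`."""
    if len(strings) <= 1:
        return list(strings)

    def char_at(s):
        # -1 stands for "end of string" and sorts before every real character.
        return ord(s[depth]) if depth < len(s) else -1

    pivot = char_at(strings[len(strings) // 2])
    less = [s for s in strings if char_at(s) < pivot]
    equal = [s for s in strings if char_at(s) == pivot]
    greater = [s for s in strings if char_at(s) > pivot]

    # Strings in `equal` agree up to and including `depth`, so move on to the
    # next character, unless they have all ended and are therefore identical.
    if pivot != -1:
        equal = multikey_quicksort(equal, depth + 1)
    return multikey_quicksort(less, depth) + equal + multikey_quicksort(greater, depth)


class TSTNode:
    """One node of a ternary search tree (assumed layout, for illustration)."""
    __slots__ = ("char", "lo", "eq", "hi", "is_end")

    def __init__(self, char):
        self.char = char
        self.lo = self.eq = self.hi = None
        self.is_end = False  # True if a stored key ends at this node


def tst_insert(node, key, i=0):
    """Insert a non-empty `key`; return the (possibly new) subtree root."""
    c = key[i]
    if node is None:
        node = TSTNode(c)
    if c < node.char:
        node.lo = tst_insert(node.lo, key, i)       # BST step, same position
    elif c > node.char:
        node.hi = tst_insert(node.hi, key, i)       # BST step, same position
    elif i + 1 < len(key):
        node.eq = tst_insert(node.eq, key, i + 1)   # trie step, next position
    else:
        node.is_end = True
    return node


def tst_contains(node, key):
    """Return True if `key` was inserted; the empty key is never stored."""
    i = 0
    while node is not None and key:
        c = key[i]
        if c < node.char:
            node = node.lo
        elif c > node.char:
            node = node.hi
        elif i + 1 < len(key):
            node = node.eq
            i += 1
        else:
            return node.is_end
    return False
```

To try it, call `multikey_quicksort` on a list of strings, or build a tree by folding `root = tst_insert(root, word)` over the keys starting from `root = None`, then query it with `tst_contains(root, word)`.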

Find and Place Sorting Technique for Unique Numbers
TLDR
A new sorting algorithm, “Find and Place Sorting Technique for Unique Numbers (FPSTUN)”, is presented; it is designed to sort quickly and easily and to be as efficient as existing sorting algorithms.
Advanced Optimization of Fundamental Searching and Sorting Algorithms
TLDR
A new searching algorithm, “OBSwQS”, is designed to search more quickly and effectively than the existing version of the searching algorithm that uses the concepts of Binary Search and Quick Sort.
Radix Plus Length Based Insert Sort
  • Yongcheng Zhang
  • Computer Science
    Seventh IEEE/ACIS International Conference on Computer and Information Science (icis 2008)
  • 2008
TLDR
A new improved radix sort algorithm, radix plus length based insert sort (R-LI), is presented for sorting large sets of variable-length string keys.
Efficient String Sorting Algorithms: Cache-aware and Cache-Oblivious
TLDR
Various algorithms are discussed that aim to minimize the number of cache misses, reducing the I/O bottleneck and making sorting faster and more efficient.
Faster suffix sorting
Efficient Trie-Based Sorting of Large Sets of Strings
TLDR
It is shown that a better choice of data structures further improves the efficiency of the burstsort algorithm, at a small additional cost in memory.
Communication-Efficient String Sorting
TLDR
These algorithms inspect only characters that are needed to determine the sorting order and communication volume is reduced by also communicating only those characters and by communicating repetitions of the same prefixes only once.
Parallel suffix sorting
TLDR
The focus is on deriving a practical implementation that works well for typical inputs rather than achieving the best possible asymptotic running-time for artificial, worst-case inputs.
Fast And Space Efficient Trie Searches
TLDR
Three algorithms are presented: two have constant insert, search, and delete cost, are faster than Hash Trees, and can be searched twice as quickly as Ternary Search Trees (TST); the third has a lg(N) byte compare cost like a TST, but is faster.
Digital Access to Comparison-Based Tree Data Structures and Algorithms
  • S. Roura
  • Computer Science
    J. Algorithms
  • 2001
This paper presents a simple method of building tree data structures, which only requires visiting O(log N) nodes and comparing O(D) digits per search or update, where N is the number of keys and D

References

SHOWING 1-10 OF 30 REFERENCES
Partial-Match Retrieval Algorithms
  • R. Rivest
  • Computer Science
    SIAM J. Comput.
  • 1976
TLDR
A new class of combinatorial designs (called associative block designs) provides better hash functions with a greatly reduced worst-case number of lists examined, yet with optimal average behavior maintained.
Suffix arrays: a new method for on-line string searches
TLDR
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.
An Algorithm for String Matching with a Sequence of don't Cares
Quicksort with Equal Keys
TLDR
It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best.
Engineering Radix Sort
TLDR
Three ways to sort strings by bytes left to right (a stable list sort, a stable two-array sort, and an in-place "American flag" sort) are illustrated with practical C programs; all three perform comparably, usually running at least twice as fast as a good quicksort.
Self-adjusting binary search trees
TLDR
The splay tree, a self-adjusting form of binary search tree, is developed and analyzed and is found to be as efficient as balanced trees when total running time is the measure of interest.
Sorting Multisets and Vectors In-Place
TLDR
An optimal in-place algorithm to lexicographically sort an array of multidimensional vectors is obtained, by applying the multiset sorting algorithm in each coordinate by adapting heapsort for multisets.
Implementing Quicksort programs
TLDR
A detailed implementation combining the most effective improvements to Quicksort is given, along with a discussion of how to implement it in assembly language, including how to apply various code optimization techniques.
Randomized binary searching with tree structures
A more efficient method of using tree structures is proposed, which utilizes both plus and minus branches in the search path. Very significant gains result when the search key includes alphabetic
Engineering a sort function
TLDR
A new qsort function for a C library is described: it chooses partitioning elements by a new sampling scheme, partitions by a novel solution to Dijkstra's Dutch National Flag problem, and swaps efficiently.