Suffix arrays with a twist

Abstract

The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches to improving its efficiency further: changes to the navigation, changes to the data layout, and the addition of extra data. In short, we show that (i) the way we search for the right interval boundary significantly affects the overall search speed, (ii) a B-tree data layout easily outperforms the standard one, (iii) the well-known idea of a lookup table for the prefixes of the suffixes can be refined using compression, and (iv) caching prefixes of the suffixes in a helper array offers a(nother) practical space-time tradeoff.
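To make the terms in the abstract concrete, here is a minimal sketch of the baseline these ideas build on: pattern search in a suffix array by finding the left and right interval boundaries, with a lookup table narrowing the initial range. It is not the authors' implementation. The function names (`build_suffix_array`, `find_interval`, `build_lookup_table`, `search`) are hypothetical, the table is keyed on one-character prefixes only and is uncompressed, and the construction is deliberately naive.

```python
def build_suffix_array(text):
    # Naive construction by sorting all suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_interval(text, sa, pattern, lo=0, hi=None):
    """Return [left, right) in sa[lo:hi] of suffixes starting with pattern."""
    if hi is None:
        hi = len(sa)
    m = len(pattern)

    # Left boundary: first suffix whose m-character prefix is >= pattern.
    l, r = lo, hi
    while l < r:
        mid = (l + r) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            l = mid + 1
        else:
            r = mid
    left = l

    # Right boundary: first suffix whose m-character prefix is > pattern.
    # The search starts from the left boundary rather than from lo.
    r = hi
    while l < r:
        mid = (l + r) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            l = mid + 1
        else:
            r = mid
    return left, l


def build_lookup_table(text, sa):
    # Maps each first character to the [start, end) range of suffixes
    # beginning with it (assumption: 1-character prefixes, no compression).
    table = {}
    for rank, pos in enumerate(sa):
        c = text[pos]
        start, _ = table.get(c, (rank, rank))
        table[c] = (start, rank + 1)
    return table


def search(text, sa, table, pattern):
    """Return the sorted text positions where pattern occurs."""
    if not pattern:
        return sorted(sa)
    lo, hi = table.get(pattern[0], (0, 0))
    left, right = find_interval(text, sa, pattern, lo, hi)
    return sorted(sa[left:right])


text = "banana"
sa = build_suffix_array(text)
table = build_lookup_table(text, sa)
print(search(text, sa, table, "ana"))
```

In this sketch the right boundary is located by a second binary search over the range left open by the first one; point (i) of the abstract concerns how that step is carried out, and points (ii) to (iv) concern the layout of `sa` and the auxiliary data that the table stands in for here.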

Cite this paper

@article{Kowalski2016SuffixAW, title={Suffix arrays with a twist}, author={Tomasz Marek Kowalski and Szymon Grabowski and Kimmo Fredriksson and Marcin Raniszewski}, journal={CoRR}, year={2016}, volume={abs/1607.08176} }