Two birds with one stone: An efficient hierarchical framework for top-k and threshold-based string similarity search

@article{Wang2015TwoBW,
  title={Two birds with one stone: An efficient hierarchical framework for top-k and threshold-based string similarity search},
  author={Zhongjing Wang and Guoliang Li and Dong Deng and Yong Zhang and Jianhua Feng},
  journal={2015 IEEE 31st International Conference on Data Engineering},
  year={2015},
  pages={519--530}
}
String similarity search is a fundamental operation in data cleaning and integration. It has two variants: threshold-based string similarity search and top-k string similarity search. Existing algorithms are efficient for either the former or the latter, but most cannot support both variants. To address this limitation, we propose a unified framework. We first recursively partition strings into disjoint segments and build a hierarchical segment tree index (HS-Tree) on top of the segments…
This paper has 21 citations.
