Selectivity Estimation for Fuzzy String Predicates in Large Data Sets

@inproceedings{Jin2005SelectivityEF,
  title={Selectivity Estimation for Fuzzy String Predicates in Large Data Sets},
  author={Liang Jin and Chen Li},
  booktitle={VLDB},
  year={2005}
}
Many database applications have an emerging need to support fuzzy queries that ask for strings that are similar to a given string, such as “name similar to smith” and “telephone number similar to 412-0964.” Query optimization needs the selectivity of such a fuzzy predicate, i.e., the fraction of records in the database that satisfy the condition. In this paper, we study the problem of estimating selectivities of fuzzy string predicates. We develop a novel technique, called Sepia, to solve the…
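
To make the definition of selectivity concrete, the following minimal sketch computes the exact selectivity of a fuzzy predicate by scanning every record. It is not the Sepia technique, whose details are not given in the abstract above. It assumes that "similar" means edit distance at most a threshold, and the names `edit_distance`, `exact_selectivity`, and `max_distance`, as well as the sample data, are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def exact_selectivity(records: Sequence[str], query: str, max_distance: int) -> float:
    """Fraction of records within max_distance edits of query (assumed similarity measure)."""
    if not records:
        return 0.0
    matches = sum(1 for r in records if edit_distance(r, query) <= max_distance)
    return matches / len(records)


# Hypothetical sample data for the predicate "name similar to smith".
names = ["smith", "smyth", "smithe", "jones", "schmidt", "smit"]
selectivity = exact_selectivity(names, "smith", max_distance=1)
print(f"selectivity = {selectivity:.3f}")
```

A full scan like this costs as much as running the query itself, which is why a query optimizer needs an estimate of the fraction rather than the exact value.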
This paper has 50 citations.

References

Publications referenced by this paper.
A probabilistic approach to metasearching with adaptive probing

Proceedings. 20th International Conference on Data Engineering • 2004

Selectivity estimation for string predicates: overcoming the underestimation problem

Proceedings. 20th International Conference on Data Engineering • 2004

Efficient record linkage in large data sets

Eighth International Conference on Database Systems for Advanced Applications, 2003. (DASFAA 2003). Proceedings. • 2003

Approximate Text Joins and Their Integration into an RDBMS

L. Gravano, P. Ipeirotis, N. Koudas, D. Srivastava
Proceedings of WWW • 2002