Active sampling for entity matching

Kedar Bellare, Suresh Iyengar, Aditya G. Parameswaran, Vibhor Rastogi
In entity matching, a fundamental issue when training a classifier to label pairs of entities as duplicates or non-duplicates is selecting informative training examples. Although active learning presents an attractive solution to this problem, previous approaches minimize the misclassification rate (0-1 loss) of the classifier, which is an unsuitable metric for entity matching because of class imbalance (i.e., there are many more non-duplicate pairs than duplicate pairs). To address this…
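The class-imbalance point in the abstract can be illustrated with a small sketch. The data below is hypothetical (1,000 candidate pairs, of which 10 are true duplicates), and the helper names `zero_one_loss` and `precision_recall` are illustrative assumptions, not taken from the paper. The example only shows why a low 0-1 loss can be misleading; it does not reproduce the paper's method.

```python
def zero_one_loss(y_true, y_pred):
    """Fraction of pairs whose predicted label differs from the true label."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the duplicate class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    # Convention assumed here: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical data: 10 duplicate pairs (1) among 1,000 candidate pairs.
y_true = [1] * 10 + [0] * 990

# A trivial classifier that labels every pair as a non-duplicate.
y_all_negative = [0] * len(y_true)

loss = zero_one_loss(y_true, y_all_negative)
precision, recall = precision_recall(y_true, y_all_negative)

print(f"0-1 loss:  {loss:.3f}")
print(f"precision: {precision:.3f}")
print(f"recall:    {recall:.3f}")
```

Under these assumptions the trivial classifier misclassifies only the 10 duplicate pairs, so its 0-1 loss is small even though it finds no duplicates at all, which is the sense in which misclassification rate is a poor objective for imbalanced matching tasks.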


Publications citing this paper.

Gradual Machine Learning for Entity Resolution

ArXiv • 2018

Enabling Quality Control for Entity Resolution: A Human and Machine Cooperation Framework

2018 IEEE 34th International Conference on Data Engineering (ICDE) • 2018

Semantic Scholar estimates that this publication has 107 citations based on the available data.


