Generic Entity Resolution with Data Confidences

David Menestrina, Omar Benjelloun, Hector Garcia-Molina
We consider the Entity Resolution (ER) problem (also known as deduplication, or merge-purge), in which records determined to represent the same real-world entity are successively located and merged. Our approach to the ER problem is generic, in the sense that the functions for comparing and merging records are viewed as black boxes. In this context, managing numerical confidences along with the data makes the ER problem more challenging to define (e.g., how should confidences of merged records…
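As an illustration of what "generic" means here, the following is a minimal sketch of a resolution loop that treats the compare and merge functions as caller-supplied black boxes. It is a naive fixed-point loop written for this note, not the algorithm from the paper, and it leaves confidences entirely to the supplied functions. The names `resolve`, `match`, and `merge`, and the toy record representation (a frozenset of attribute/value pairs), are assumptions made for the example.

```python
from typing import Callable, List, TypeVar

R = TypeVar("R")


def resolve(
    records: List[R],
    match: Callable[[R, R], bool],
    merge: Callable[[R, R], R],
) -> List[R]:
    """Naive sketch: merge any matching pair until no pair matches.

    `match` and `merge` are opaque to this loop (hypothetical names);
    any handling of confidences would live inside them.
    """
    result = list(records)
    changed = True
    while changed:
        changed = False
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                if match(result[i], result[j]):
                    merged = merge(result[i], result[j])
                    # Drop the matched pair, add the merged record, rescan.
                    result = [r for k, r in enumerate(result) if k not in (i, j)]
                    result.append(merged)
                    changed = True
                    break
            if changed:
                break
    return result


# Toy usage: records are sets of (attribute, value) pairs.
# Two records match if they share any pair; merging is set union.
records = [
    frozenset({("name", "J. Doe"), ("phone", "555-0100")}),
    frozenset({("name", "John Doe"), ("phone", "555-0100")}),
    frozenset({("name", "A. Smith")}),
]
resolved = resolve(records, lambda a, b: bool(a & b), lambda a, b: a | b)
```

In this toy run the two "Doe" records share a phone number and are merged into one record holding both names, while the "A. Smith" record is left alone. Each merge reduces the number of records by one, so the loop always terminates.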
This paper has 47 citations.