Efficient entity resolution for large heterogeneous information spaces

@inproceedings{Papadakis2011EfficientER,
  title={Efficient entity resolution for large heterogeneous information spaces},
  author={George Papadakis and Ekaterini Ioannou and Claudia Nieder{\'e}e and Peter Fankhauser},
  booktitle={WSDM},
  year={2011}
}
We have recently witnessed an enormous growth in the volume of structured and semi-structured data sets available on the Web. An important prerequisite for using and combining such data sets is the detection and merging of information that describes the same real-world entities, a task known as Entity Resolution. To make this quadratic task efficient, blocking techniques are typically employed. However, the high dynamics, loose schema binding, and heterogeneity of (semi-)structured data impose…
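
To illustrate why blocking helps with the quadratic cost mentioned in the abstract, here is a minimal sketch of one generic, schema-agnostic form of blocking (grouping profiles by shared tokens). The abstract is truncated, so this is not presented as the paper's method; the function names `token_blocking` and `candidate_pairs` and the toy profiles are assumptions made up for this example.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(profiles):
    """Place each profile id in one block per token found in any attribute value.

    Attribute names are ignored, so profiles with different schemas can still
    share a block. (Hypothetical helper, for illustration only.)
    """
    blocks = defaultdict(set)
    for pid, attributes in profiles.items():
        for value in attributes.values():
            for token in str(value).lower().split():
                blocks[token].add(pid)
    # A block with a single profile produces no comparisons, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct pairs of profiles that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Toy profiles with deliberately different attribute names (made-up data).
profiles = {
    "p1": {"name": "Alice Smith", "city": "Berlin"},
    "p2": {"fullName": "Smith Alice", "employer": "Acme"},
    "p3": {"title": "Blue Widget", "price": "10"},
    "p4": {"label": "widget blue large"},
}

blocks = token_blocking(profiles)
pairs = candidate_pairs(blocks)
total = len(profiles) * (len(profiles) - 1) // 2  # exhaustive pairwise comparisons

print(sorted(pairs))            # [('p1', 'p2'), ('p3', 'p4')]
print(len(pairs), "of", total)  # 2 of 6
```

On this toy input only 2 of the 6 possible pairs survive as candidates. Real data sets would also need block-size limits and a matching step on the surviving pairs, both of which are out of scope for this sketch.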

Citations

Publications citing this paper.

Sorted Neighborhood for Schema-free RDF Data


Schema-Agnostic Progressive Entity Resolution

  • 2018 IEEE 34th International Conference on Data Engineering (ICDE)
  • 2018

Decision-Making Bias in Instance Matching Model Selection

  • International Semantic Web Conference
  • 2015

Effective Instance Matching for Heterogeneous Structured Data


Meta-Blocking: Taking Entity Resolution to the Next Level

  • IEEE Transactions on Knowledge and Data Engineering
  • 2014

Supervised Meta-blocking


A Blocking Framework for Entity Resolution in Highly Heterogeneous Information Spaces

  • IEEE Transactions on Knowledge and Data Engineering
  • 2013

CITATION STATISTICS

  • 19 Highly Influenced Citations

  • Averaged 6 Citations per year over the last 3 years

  • 11% Increase in citations per year in 2018 over 2017
