Collective Entity Resolution in Familial Networks


Entity resolution in settings with rich relational structure often introduces complex dependencies between coreferences. Exploiting these dependencies is challenging: it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided, from the perspective of different family members, and the challenge is to reconstruct a family tree from these multiple, noisy, partial views. This reconstruction is crucial for applications such as understanding genetic inheritance, tracking disease contagion, and performing census surveys. Here, we design a model that incorporates statistical signals, such as name similarity, relational information, such as sibling overlap, and logical constraints, such as transitivity and bijective matching, in a collective model. We show how to integrate these features using probabilistic soft logic, a scalable probabilistic programming framework. In experiments on real-world data, our model significantly outperforms state-of-the-art classifiers that use relational features but are incapable of collective reasoning.
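To make the three kinds of signal named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' model or the PSL library's API: the function names, the choice of `difflib` for string similarity, and the use of Jaccard overlap on exact sibling names are all hypothetical. The two penalty functions use the Łukasiewicz relaxation that probabilistic soft logic applies to logical rules, where truth values lie in [0, 1] and a rule's "distance to satisfaction" is a hinge function.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Statistical signal: case-insensitive string similarity in [0, 1].

    Assumption: difflib's ratio stands in for whatever name metric is used.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sibling_overlap(sibs_a, sibs_b) -> float:
    """Relational signal: Jaccard overlap of two mentions' sibling name sets.

    Assumption: sibling names are compared exactly; a real system would
    compare them with a fuzzy similarity as well.
    """
    a, b = set(sibs_a), set(sibs_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def transitivity_penalty(s_ab: float, s_bc: float, s_ac: float) -> float:
    """Distance to satisfaction of Same(a,b) & Same(b,c) -> Same(a,c)
    under the Lukasiewicz relaxation: max(0, s_ab + s_bc - 1 - s_ac)."""
    return max(0.0, s_ab + s_bc - 1.0 - s_ac)


def bijection_penalty(s_ab: float, s_ac: float) -> float:
    """Distance to satisfaction of Same(a,b) & Same(a,c) -> False, i.e.
    mention a should match at most one of b and c from the same tree."""
    return max(0.0, s_ab + s_ac - 1.0)
```

A collective model would choose all the `Same` scores jointly so that the weighted sum of such penalties is small, rather than scoring each pair in isolation; that joint choice is what separates it from a pairwise classifier that merely consumes relational features.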


Cite this paper

@inproceedings{Kouki2017CollectiveER, title={Collective Entity Resolution in Familial Networks}, author={Pigi Kouki and Jay Pujara and Christopher Steven Marcum and Laura M. Koehly and Lise Getoor}, year={2017} }