TabEL: Entity Linking in Web Tables

Abstract

Web tables form a valuable source of relational data. The Web contains an estimated 154 million HTML tables of relational data, with Wikipedia alone containing 1.6 million high-quality tables. Extracting the semantics of Web tables to produce machine-understandable knowledge has become an active area of research. A key step in extracting the semantics of Web content is entity linking (EL): the task of mapping a phrase in text to its referent entity in a knowledge base (KB). In this paper we present TabEL, a new EL system for Web tables. TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB. Instead, TabEL enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables. In experiments, TabEL significantly reduces error when compared to current state-of-the-art table EL systems, including a 75% error reduction on Wikipedia tables and a 60% error reduction on Web tables. We also make our parsed Wikipedia table corpus and test datasets publicly available for future work.
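To make the idea of soft co-occurrence constraints concrete, the following is a minimal toy sketch, not TabEL's actual model. It scores each joint assignment of entities to table cells as the sum of log-priors plus a weighted sum of pairwise co-occurrence scores, and picks the best assignment by exhaustive search. The mentions, candidate entities, priors, co-occurrence values, the `weight` parameter, and the function names are all hypothetical and chosen only for illustration. Exhaustive enumeration is feasible only for tiny tables, so a real system would need approximate inference.

```python
import math
from itertools import combinations, product

# Hypothetical candidates per cell mention with made-up prior probabilities
# (e.g. how often the phrase links to each entity).
candidates = {
    "Chicago": {"Chicago_(city)": 0.8, "Chicago_(band)": 0.2},
    "Boston": {"Boston_(city)": 0.7, "Boston_(band)": 0.3},
    "Styx": {"Styx_(band)": 1.0},
}

# Hypothetical symmetric co-occurrence scores in [0, 1]; missing pairs count as 0.
cooccur = {
    frozenset(["Chicago_(city)", "Boston_(city)"]): 0.6,
    frozenset(["Chicago_(band)", "Boston_(band)"]): 0.8,
    frozenset(["Chicago_(band)", "Styx_(band)"]): 0.9,
    frozenset(["Boston_(band)", "Styx_(band)"]): 0.9,
}


def joint_score(assignment, weight=2.0):
    """Log-prior of each chosen entity plus weighted pairwise co-occurrence."""
    prior = sum(math.log(candidates[m][e]) for m, e in assignment.items())
    coherence = sum(
        cooccur.get(frozenset(pair), 0.0)
        for pair in combinations(assignment.values(), 2)
    )
    return prior + weight * coherence


def link_table(mentions, weight=2.0):
    """Return the highest-scoring joint assignment by brute-force enumeration."""
    best, best_score = None, -math.inf
    for combo in product(*(candidates[m] for m in mentions)):
        assignment = dict(zip(mentions, combo))
        score = joint_score(assignment, weight)
        if score > best_score:
            best, best_score = assignment, score
    return best


if __name__ == "__main__":
    column = ["Chicago", "Boston", "Styx"]
    print(link_table(column))              # co-occurrence taken into account
    print(link_table(column, weight=0.0))  # priors only
```

With these made-up numbers, the priors alone favor the city readings of "Chicago" and "Boston", while the co-occurrence term pulls both toward the band readings because they appear alongside "Styx". The constraint is soft in the sense that no column type is ever declared: a strong enough prior could still override the context.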

DOI: 10.1007/978-3-319-25007-6_25

Cite this paper

@inproceedings{Bhagavatula2015TabELEL,
  title={TabEL: Entity Linking in Web Tables},
  author={Chandra Bhagavatula and Thanapon Noraset and Doug Downey},
  booktitle={International Semantic Web Conference},
  year={2015}
}