Tab2Know: Building a Knowledge Base from Tables in Scientific Papers

@article{Kruit2020Tab2KnowBA,
  title={Tab2Know: Building a Knowledge Base from Tables in Scientific Papers},
  author={Benno Kruit and Hongyu He and Jacopo Urbani},
  journal={ArXiv},
  year={2020},
  volume={abs/2107.13306}
}
Tables in scientific papers contain a wealth of valuable knowledge for the scientific enterprise. To help the many of us who frequently consult this type of knowledge, we present Tab2Know, a new end-to-end system to build a Knowledge Base (KB) from tables in scientific papers. Tab2Know addresses the challenge of automatically interpreting the tables in papers and of disambiguating the entities that they contain. To solve these problems, we propose a pipeline that employs both statistical-based… 

Materializing Knowledge Bases via Trigger Graphs

An extensive theoretical and empirical study examines when and how trigger graphs (TGs) can be computed and what benefits they bring when applied over real-world KBs, and introduces algorithms that compute (minimal) TGs.

Data augmentation on graphs for table type classification

This work addresses the classification of tables using a Graph Neural Network, exploiting the table structure for the message passing algorithm in use, and proposes data augmentation techniques directly on the table graph structures.

References

TabEL: Entity Linking in Web Tables

TabEL differs from previous work by weakening the assumption that the semantics of a table can be mapped to pre-defined types and relations found in the target KB, and enforces soft constraints in the form of a graphical model that assigns higher likelihood to sets of entities that tend to co-occur in Wikipedia documents and tables.

Profiling the Potential of Web Tables for Augmenting Cross-domain Knowledge Bases

This paper matches a large, publicly available Web table corpus to the DBpedia knowledge base and empirically examines the Local Closed World Assumption to determine the maximal number of correct facts that an ideal data fusion strategy could generate and concludes that knowledge-based trust outperforms PageRank- and voting-based fusion.

Annotating and searching web tables using entities, types and relationships

This paper proposes new machine learning techniques to annotate table cells with entities that they likely mention, table columns with types from which entities are drawn for cells in the column, and relations that pairs of table columns seek to express, and a new graphical model for making all these labeling decisions for each table simultaneously.

Finding related tables

This work considers the problem of finding related tables in a large corpus of heterogeneous tables and proposes a framework that captures several types of relatedness, including tables that are candidates for joins and tables that are candidates for union.

TableNet: An Approach for Determining Fine-grained Relations for Wikipedia Tables

Based on an extensive evaluation with more than 3.2M tables, it is shown that TableNet retains more than 88% of relevant table pairs and assigns table relations with an accuracy of 90%.

Extracting Novel Facts from Tables for Knowledge Graph Completion (Extended version)

The proposed end-to-end method for extending a Knowledge Graph from tables has a higher recall during the interpretation process than the state-of-the-art, and is more resistant against the bias observed in extracting mostly redundant facts since it produces more novel extractions.

Matching HTML Tables to DBpedia

This paper presents the T2D gold standard for measuring and comparing the performance of HTML table to knowledge base matching systems, and shows that T2K Match discovers table-to-class correspondences with a precision of 94%. Such matching requires finding correspondences between the rows/columns of a table and the entities/schema elements of the knowledge base.

Effective and efficient Semantic Table Interpretation using TableMiner+

This article introduces TableMiner+, a Semantic Table Interpretation method that annotates Web tables both effectively and efficiently, and that significantly reduces computational overhead in terms of wall-clock time compared with classic methods that ‘exhaustively’ process the entire table content to build features for inference.

ColNet: Embedding the Semantics of Web Tables for Column Type Prediction

A neural network based column type annotation framework named ColNet is proposed, which is able to integrate KB reasoning and lookup with machine learning and can automatically train Convolutional Neural Networks for prediction.

Experimental Evidence Extraction System in Data Science with Hybrid Table Features and Ensemble Learning

This work builds an experimental evidence extraction system to automate the integration of tables (in the paper PDFs) into a database of experimental results, and proposes hybrid features including structural and semantic table features as well as an ensemble learning approach for column/row name classification and table unification.