Corpus ID: 211296334

Data Augmentation for Personal Knowledge Graph Population

Lingraj S. Vannur, Lokesh Nagalapatti, Balaji Ganesan, Hima Patel
Cold start knowledge base population (KBP) is the problem of populating a knowledge base from unstructured documents. While artificial neural networks have led to significant improvements in the different tasks that are part of KBP, the overall F1 of the end-to-end system remains quite low. This problem is more acute in personal knowledge bases, which present additional challenges with regard to data protection, fairness and privacy. In this work, we present a system that uses rule based…
Document Structure aware Relational Graph Convolutional Networks for Ontology Population
The role of document structure in learning ontological relationships between concepts in a document corpus is examined. Inspired by ideas from hypernym discovery and explainability, the method is about 15 points more accurate than a stand-alone RGCN model on this task.
Reimagining GNN Explanations with ideas from Tabular Data
This work uses a task that straddles both graphs and tabular data, namely Entity Matching, to comment on key aspects of explainability that are missing in GNN model explanations.
Explainable Link Prediction for Privacy-Preserving Contact Tracing
In this concept paper, ideas from Graph Neural Networks and explainability are presented that could improve trust in contact tracing applications and encourage adoption by people.
Link Prediction using Graph Neural Networks for Master Data Management
Novel methods for anonymizing data, model training, explainability and verification for Link Prediction in Master Data Management are introduced, and the results are discussed.


KnowledgeNet: A Benchmark Dataset for Knowledge Base Population
Five baseline approaches are discussed. The best achieves an F1 score of 0.50, outperforming a traditional approach by 79% but remaining far from human performance, which indicates that the KnowledgeNet dataset is challenging.
Position-aware Attention and Supervised Data Improve Slot Filling
An effective new model is proposed, which combines an LSTM sequence model with a form of entity position-aware attention that is better suited to relation extraction. The work also builds TACRED, a large supervised relation extraction dataset obtained via crowdsourcing and targeted towards TAC KBP relations.
Collective Learning From Diverse Datasets for Entity Typing in the Wild
A Collective Learning Framework is proposed that enables learning from diverse datasets in a unified way: label information from all available datasets is aggregated to build a single neural network classifier using UHLS, label mapping, and a partial loss function.
Overview of Linguistic Resources for the TAC KBP 2017 Evaluations: Methodologies and Results
This paper describes LDC's resource creation efforts and their results in support of TAC KBP 2017, an evaluation track of the Text Analysis Conference.
Overview of the TAC 2010 Knowledge Base Population Track
An overview of the task definition and annotation challenges associated with KBP2010 is provided, and the evaluation results and lessons learned are discussed based on detailed analysis.
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task
This paper first validates the most challenging 5K examples in the development and test sets using trained annotators and finds that label errors account for 8% absolute F1 test error, and that more than 50% of the examples need to be relabeled.
Personal knowledge graph population from user utterances in conversational understanding
A statistical language understanding approach is introduced to automatically construct personal (user-centric) knowledge graphs in conversational dialogs, in order to better understand users' requests, fulfill them, and enable other technologies such as better inferences or proactive interactions.
Link Prediction on N-ary Relational Data
A method to conduct Link Prediction on N-ary relational data, called NaLP, is proposed, which explicitly models the relatedness of all the role-value pairs in the same n-ary relational fact.
A Unified Labeling Approach by Pooling Diverse Datasets for Entity Typing
This work converts the label set of all datasets to a unified hierarchical label set while preserving the semantic properties of the individual labels, and trains a single neural network based classifier using every available dataset for the ET task.
A Neural Architecture for Person Ontology population
This work presents a system for automatically populating a person ontology graph from unstructured data using neural models for Entity Classification and Relation Extraction, and introduces a new dataset for these tasks.