Corpus ID: 219721288

Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network

@article{Jiang2020CanonicalizingOK,
  title={Canonicalizing Open Knowledge Bases with Multi-Layered Meta-Graph Neural Network},
  author={Tianwen Jiang and Tong Zhao and Bing Qin and Ting Liu and N. Chawla and Meng Jiang},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.09610}
}
Noun phrases and relational phrases in Open Knowledge Bases are often not canonical, leading to redundant and ambiguous facts. In this work, we integrate structural information (from which tuple, which sentence) and semantic information (semantic similarity) to perform canonicalization. We represent the two types of information as a multi-layered graph: the structural information forms the links across the sentence, relational phrase, and noun phrase layers; the semantic information forms…
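
To make the construction concrete, here is a minimal Python sketch (not the authors' code) of such a multi-layered graph: structural edges connect each sentence to the relational phrase and noun phrases extracted from it, and semantic edges connect similar phrases within a layer. The example tuples, the Jaccard similarity stand-in, and the 0.3 threshold are all illustrative assumptions.

# A minimal sketch of the multi-layered graph described in the abstract:
# sentence, relational-phrase, and noun-phrase layers with cross-layer
# "structural" edges from extracted tuples and within-layer "semantic"
# edges from phrase similarity. All names and thresholds are illustrative.
import itertools
import networkx as nx

def jaccard(a: str, b: str) -> float:
    """Toy semantic similarity over word overlap (a stand-in for embeddings)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# (sentence, relational phrase, subject NP, object NP) tuples from an Open IE run
tuples = [
    ("s1: NYC is the largest city in the US.", "is the largest city in", "NYC", "the US"),
    ("s2: New York City is located in the United States.", "is located in", "New York City", "the United States"),
]

G = nx.Graph()
for sent, rel, subj, obj in tuples:
    G.add_node(sent, layer="sentence")
    G.add_node(rel, layer="relation")
    for phrase in (subj, obj):
        G.add_node(phrase, layer="noun_phrase")
    # structural edges: which tuple / which sentence each phrase came from
    G.add_edge(sent, rel, kind="structural")
    G.add_edge(sent, subj, kind="structural")
    G.add_edge(sent, obj, kind="structural")
    G.add_edge(rel, subj, kind="structural")
    G.add_edge(rel, obj, kind="structural")

# semantic edges: connect similar phrases within the same layer
for layer in ("relation", "noun_phrase"):
    nodes = [n for n, d in G.nodes(data=True) if d["layer"] == layer]
    for a, b in itertools.combinations(nodes, 2):
        if jaccard(a, b) > 0.3:
            G.add_edge(a, b, kind="semantic")

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")

The paper's meta-graph neural network then, presumably, learns over a graph of this shape to decide which noun phrases and relational phrases should be merged.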

Citations

Generating Domain-Specific Knowledge Graphs: Challenges with Open Information Extraction

This paper describes the combined use and adaptation of existing open information extraction methods to build an art-historic knowledge graph that can facilitate data exploration for domain experts, and presents a detailed error analysis to identify the limitations of existing methods when working with domain-specific corpora.

References

Showing 1-10 of 24 references

CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information

Canonicalization using Embeddings and Side Information (CESI) is proposed, a novel approach that performs canonicalization over learned embeddings of Open KBs by incorporating relevant noun-phrase and relation-phrase side information in a principled manner.
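
As a rough illustration of clustering phrases in embedding space (a sketch only, not CESI itself), the snippet below groups noun phrases by cosine similarity; the vectors are random stand-ins for learned Open KB embeddings with side information folded in, and the 0.9 threshold is an arbitrary choice.

# Hedged sketch: greedy clustering of phrase embeddings by cosine similarity.
import numpy as np

phrases = ["Barack Obama", "Obama", "President Obama", "Hillary Clinton", "Secretary Clinton"]
rng = np.random.default_rng(0)
base = {"obama": rng.normal(size=8), "clinton": rng.normal(size=8)}
# pretend these rows come from a trained embedding model plus side information
embeddings = np.stack([
    base["obama"] + 0.05 * rng.normal(size=8),
    base["obama"] + 0.05 * rng.normal(size=8),
    base["obama"] + 0.05 * rng.normal(size=8),
    base["clinton"] + 0.05 * rng.normal(size=8),
    base["clinton"] + 0.05 * rng.normal(size=8),
])

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

clusters = []  # single-pass greedy clustering (a stand-in for proper HAC)
for i, vec in enumerate(embeddings):
    for cluster in clusters:
        if cosine(vec, embeddings[cluster[0]]) > 0.9:
            cluster.append(i)
            break
    else:
        clusters.append([i])

for cluster in clusters:
    print({phrases[i] for i in cluster})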

Neural Cross-Lingual Entity Linking

This paper proposes a neural entity linking (EL) model that learns fine-grained similarities and dissimilarities between the query and candidate documents from multiple perspectives, combining convolutional and tensor networks, and shows that the English-trained system can be applied zero-shot to other languages by making surprisingly effective use of multilingual embeddings.

Leveraging Linguistic Structure For Open Domain Information Extraction

This work replaces the large pattern sets used by earlier Open IE systems with a few patterns for canonically structured sentences, and shifts the focus to a classifier that learns to extract self-contained clauses from longer sentences, from which the maximally specific arguments for each candidate triple are determined.
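
As a hedged illustration of the "few patterns over canonically structured clauses" idea, the sketch below applies a single subject-verb-object dependency pattern with spaCy. This is an assumption for illustration, not the paper's system, and it presumes spaCy and its small English model are installed.

# Illustrative only: one SVO pattern over dependency parses of short clauses.
import spacy

nlp = spacy.load("en_core_web_sm")

def svo_triples(text):
    """Extract (subject, relation, object) triples from simple clauses."""
    doc = nlp(text)
    triples = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ in ("nsubj", "nsubjpass")]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj", "attr")]
            for s in subjects:
                for o in objects:
                    triples.append((s.text, token.lemma_, o.text))
    return triples

print(svo_triples("Obama visited Chicago. He signed the bill."))
# roughly: [('Obama', 'visit', 'Chicago'), ('He', 'sign', 'bill')]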

DeepType: Multilingual Entity Linking by Neural Type System Evolution

DeepType is applied to the problem of entity linking on three standard datasets and is found to outperform all existing solutions by a wide margin, including approaches that rely on a human-designed type system or recent deep learning-based entity embeddings.

Canonicalizing Open Knowledge Bases

This paper presents an approach based on machine learning methods that canonicalizes Open IE triples by clustering synonymous names and phrases, thus shedding light on the middle ground between "open" and "closed" information extraction systems.
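
For intuition, here is a minimal sketch of clustering synonymous noun phrases with hierarchical agglomerative clustering over token-overlap distance. The distance function, threshold, and phrase list are illustrative choices, not the paper's actual features or pipeline.

# Hedged sketch: HAC over a token-overlap distance between noun phrases.
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
import numpy as np

phrases = ["Barack Obama", "President Barack Obama", "B. Obama",
           "New York City", "New York", "NYC"]

def token_distance(a, b):
    ta = set(a.lower().replace(".", "").split())
    tb = set(b.lower().replace(".", "").split())
    return 1.0 - len(ta & tb) / len(ta | tb)

dist = np.array([[token_distance(a, b) for b in phrases] for a in phrases])
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=0.6, criterion="distance")

clusters = {}
for phrase, label in zip(phrases, labels):
    clusters.setdefault(label, []).append(phrase)
print(list(clusters.values()))

Note that token overlap alone misses pairs like "NYC" / "New York City", which is exactly the gap that embedding- and graph-based methods such as CESI and the multi-layered graph above aim to close.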

Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking

New methods using real and complex bilinear mappings for integrating hierarchical information are presented, yielding substantial improvement over flat predictions in entity linking and fine-grained entity typing, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset.
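
A tiny sketch of the bilinear-mapping ingredient, scoring a mention representation x against a type embedding y as xᵀWy; the dimensions and matrices below are random placeholders, and the paper's complex-valued variant and hierarchical losses are not shown.

# Hedged sketch: bilinear compatibility score between a mention and a type.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=6)        # mention/context representation
y = rng.normal(size=4)        # type embedding (e.g., for /person/artist)
W = rng.normal(size=(6, 4))   # bilinear mapping; learned in the real model

score = float(x @ W @ y)
print(score)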

Inductive Representation Learning on Large Graphs

GraphSAGE is presented, a general inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data; it outperforms strong baselines on three inductive node-classification benchmarks.
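
Below is a minimal NumPy sketch of one GraphSAGE-style mean-aggregation layer with untrained weights: each node's new representation combines its own features with the mean of its neighbors' features, so embeddings can be produced inductively for unseen nodes. The toy graph, feature sizes, and the choice of the mean aggregator are illustrative (the paper also proposes LSTM and pooling aggregators).

# Hedged sketch: one GraphSAGE mean-aggregator layer over a toy graph.
import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(size=(4, 5))            # 4 nodes, 5-dim input features
neighbors = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
W_self = rng.normal(size=(5, 8))
W_neigh = rng.normal(size=(5, 8))

def graphsage_layer(h):
    out = np.zeros((h.shape[0], 8))
    for v in range(h.shape[0]):
        neigh_mean = h[neighbors[v]].mean(axis=0)   # aggregate the neighborhood
        z = h[v] @ W_self + neigh_mean @ W_neigh    # combine self and neighbors
        z = np.maximum(z, 0.0)                      # ReLU nonlinearity
        out[v] = z / (np.linalg.norm(z) + 1e-8)     # normalize the embedding
    return out

print(graphsage_layer(features).shape)  # (4, 8)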

Supervised Open Information Extraction

This paper presents a novel formulation of Open IE as a sequence tagging problem, addressing challenges such as encoding multiple extractions for a predicate, and a supervised model that outperforms existing state-of-the-art Open IE systems on benchmark datasets.
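
To illustrate the sequence-tagging formulation, here is a toy BIO-style encoding of a single extraction and a decoder that recovers the spans; the label set (A0, P, A1) is an illustrative scheme, not the paper's exact one.

# Hedged sketch: BIO tags for one Open IE extraction and a span decoder.
tokens = ["Obama", "was", "born", "in", "Hawaii", "."]
tags   = ["B-A0", "B-P", "I-P", "I-P", "B-A1", "O"]   # A0 = arg0, P = predicate, A1 = arg1

def decode(tokens, tags):
    """Group consecutive B-/I- tags of the same role into text spans."""
    spans, current_role, current = {}, None, []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_role:
                spans.setdefault(current_role, []).append(" ".join(current))
            current_role, current = tag[2:], [tok]
        elif tag.startswith("I-") and current_role == tag[2:]:
            current.append(tok)
        else:
            if current_role:
                spans.setdefault(current_role, []).append(" ".join(current))
            current_role, current = None, []
    if current_role:
        spans.setdefault(current_role, []).append(" ".join(current))
    return spans

print(decode(tokens, tags))  # {'A0': ['Obama'], 'P': ['was born in'], 'A1': ['Hawaii']}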

Distributed Representations of Words and Phrases and their Compositionality

This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
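
A toy sketch of the skip-gram negative-sampling objective for one (center, context) pair: the observed pair should score high and k sampled negatives should score low. The vectors below are random stand-ins, and k = 5 is an arbitrary choice.

# Hedged sketch: negative-sampling loss for a single training pair.
import numpy as np

rng = np.random.default_rng(42)
dim, k = 10, 5
v_center = rng.normal(scale=0.1, size=dim)
v_context = rng.normal(scale=0.1, size=dim)
v_negatives = rng.normal(scale=0.1, size=(k, dim))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# lower loss = observed pair scored higher than the sampled negatives
loss = -np.log(sigmoid(v_center @ v_context)) \
       - np.log(sigmoid(-(v_negatives @ v_center))).sum()
print(float(loss))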

Knowledge Graph Identification

This paper shows how uncertain extractions about entities and their relations can be transformed into a knowledge graph, and demonstrates that, compared to existing methods, the proposed approach achieves improved AUC and F1 with significantly lower running time.