CHOLAN: A Modular Approach for Neural Entity Linking on Wikipedia and Wikidata

Authors: M. Ravi, Kuldeep Singh, Isaiah Onando Mulang, Saeedeh Shekarpour, Johannes Hoffart, Jens Lehmann

In this paper, we propose CHOLAN, a modular approach to target end-to-end entity linking (EL) over knowledge bases. CHOLAN consists of a pipeline of two transformer-based models integrated sequentially to accomplish the EL task. The first transformer model identifies surface forms (entity mentions) in a given text. For each mention, a second transformer model is employed to select the target entity from a predefined candidate list. The latter transformer is fed by an enriched context…
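
The sequential two-stage data flow described in the abstract can be illustrated with a minimal sketch. This is not the CHOLAN implementation: both transformer models are replaced by trivial stand-ins (a dictionary lookup for mention detection and a word-overlap scorer for candidate classification), and every name, entity ID, and description below is a hypothetical placeholder chosen only to show how the stages connect.

```python
from typing import Dict, List, Set, Tuple

# Toy candidate index: surface form -> list of (entity_id, description).
# Made-up illustration data; the IDs are placeholders, not real KB identifiers.
CANDIDATES: Dict[str, List[Tuple[str, str]]] = {
    "paris": [
        ("E1", "capital city of france"),
        ("E2", "prince of troy in greek mythology"),
    ],
    "france": [("E3", "country in western europe")],
}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return [t.strip(".,!?").lower() for t in text.split()]


def detect_mentions(text: str) -> List[str]:
    """Stage 1 stand-in (a transformer in the paper): find surface forms."""
    return [t for t in tokenize(text) if t in CANDIDATES]


def score(context: Set[str], description: str) -> int:
    """Stage 2 stand-in (a transformer in the paper): word overlap."""
    return len(context & set(description.split()))


def link_entities(text: str) -> Dict[str, str]:
    """Run the stages in sequence: detect mentions, then pick one candidate each."""
    context = set(tokenize(text))
    links: Dict[str, str] = {}
    for mention in detect_mentions(text):
        best_id, _ = max(CANDIDATES[mention], key=lambda c: score(context, c[1]))
        links[mention] = best_id
    return links


if __name__ == "__main__":
    print(link_entities("Paris is the capital of France."))
```

Because the stages only communicate through the mention list and the candidate list, either stand-in could be swapped for a learned model without changing the other, which is the modularity the abstract refers to.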


Compositional Generalization in Multilingual Semantic Parsing over Wikidata

A method is proposed for creating a multilingual, parallel dataset of question-query pairs, grounded in Wikidata, and it is used to analyze the compositional generalization of semantic parsers in Hebrew, Kannada, Chinese, and English.

Entity Linking Meets Deep Learning: Techniques and Solutions

A new taxonomy is proposed that organizes existing DL-based EL methods along three axes: embedding, feature, and algorithm. Representative EL methods are then systematically surveyed along these three axes.

Improving Candidate Retrieval with Entity Profile Generation for Wikidata Entity Linking

This work proposes a novel candidate retrieval paradigm based on entity profiling that complements the traditional approach of using a Wikipedia anchor-text dictionary, and combines the two into a highly effective hybrid method for candidate retrieval.

Multilingual Compositional Wikidata Questions

This work proposes a method for creating a multilingual, parallel dataset of question-query pairs grounded in Wikidata and introduces such a dataset, Compositional Wikidata Questions (CWQ). The data is used to train and evaluate semantic parsers for Hebrew, Kannada, Chinese, and English, to better understand the current strengths and weaknesses of multilingual semantic parsing.

Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem

An autoregressive entity linking model is proposed that is trained with two auxiliary tasks and learns to re-rank generated samples at inference time; it sets a new state of the art on two entity linking benchmark datasets.

Pre-Training and Fine-Tuning with Next Sentence Prediction for Multimodal Entity Linking

This work proposes a pre-training and fine-tuning paradigm for multimodal entity linking (MEL) that outperforms the baseline models and gives the final model strong generalization capability, so that it performs well even with smaller amounts of data.

Joint Entity Linking with BERT (Master's thesis, Amund Faller Råheim)

This thesis investigates an end-to-end approach that combines mention detection and entity disambiguation in a single BERT-based model. The model reaches near state-of-the-art performance, but the thesis is unable to reproduce previous results obtained with a similar approach.

Highly Parallel Autoregressive Entity Linking with Discriminative Correction

This work proposes a very efficient approach that parallelizes autoregressive linking across all potential mentions and relies on a shallow, efficient decoder. It also augments the generative objective with an extra discriminative component, i.e., a correction term that allows the generator’s ranking to be optimized directly.

KG-ZESHEL: Knowledge Graph-Enhanced Zero-Shot Entity Linking

KG-ZESHEL is presented, a knowledge graph-enhanced zero-shot entity linking approach that extends an existing BERT-based zero-shot entity linking approach with mention and entity auxiliary information; the proposed approach is shown to outperform related BERT-based state-of-the-art entity linking models.

BLINK with Elasticsearch for Efficient Entity Linking in Business Conversations

This work presents a neural entity linking system that connects product and organization entities in business conversations to their corresponding Wikipedia and Wikidata entries, and leverages Elasticsearch to keep inference efficient when deployed on a resource-limited cloud machine.



Encoding Knowledge Graph Entity Aliases in Attentive Neural Network for Wikidata Entity Linking

This approach exploits sufficient context from a KG as a source of background knowledge, which is then fed into the neural network, and it significantly outperforms an end-to-end approach for Wikidata entity linking.

Neural Entity Linking: A Survey of Models based on Deep Learning

This work distills a generic architecture of a neural EL system and discusses its components, such as candidate generation, mention-context encoding, and entity ranking, summarizing prominent methods for each of them.

End-to-End Neural Entity Linking

This work proposes the first neural end-to-end EL system that jointly discovers and links entities in a text document and shows that it significantly outperforms popular systems on the Gerbil platform when enough training data is available.

Neural Collective Entity Linking

This work proposes a novel neural model for collective entity linking, named NCEL, which applies a Graph Convolutional Network to integrate both local contextual features and global coherence information for entity linking.

Empirical Evaluation of Pretraining Strategies for Supervised Entity Linking

In this work, we present an entity linking model which combines a Transformer architecture with large scale pretraining from Wikipedia links. Our model achieves the state-of-the-art on two commonly…

Investigating Entity Knowledge in BERT with Simple Neural End-To-End Entity Linking

This study proposes an extreme simplification of the entity linking setup that works surprisingly well: casting it as per-token classification over the entire entity vocabulary. It shows on an entity linking benchmark that this model improves the entity representations over plain BERT.

Zero-Shot Entity Linking by Reading Entity Descriptions

It is shown that strong reading comprehension models pre-trained on large unlabeled data can generalize to unseen entities, and domain-adaptive pre-training (DAP) is proposed to address the domain shift problem associated with linking unseen entities in a new domain.

A Piggyback System for Joint Entity Mention Detection and Linking in Web Queries

This paper introduces SMAPH-2, a second-order approach that, by piggybacking on a web search engine, alleviates the noise and irregularities that characterize the language of queries and puts queries in a larger context in which it is easier to make sense of them.

Bridge Text and Knowledge by Learning Multi-Prototype Entity Mention Embedding

A novel Multi-Prototype Mention Embedding model is proposed, which learns multiple sense embeddings for each mention by jointly modeling words from textual contexts and entities derived from a knowledge base, along with an efficient language-model-based approach for disambiguating each mention to a specific sense.

Knowledge Enhanced Contextual Word Representations

After integrating WordNet and a subset of Wikipedia into BERT, the knowledge enhanced BERT (KnowBert) demonstrates improved perplexity, ability to recall facts as measured in a probing task and downstream performance on relationship extraction, entity typing, and word sense disambiguation.