REL: An Entity Linker Standing on the Shoulders of Giants

Johannes M. van Hulst, Faegheh Hasibi, Koen Dercksen, Krisztian Balog, Arjen P. de Vries
Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

Entity linking is a standard component in modern retrieval systems that is often performed by third-party toolkits. Despite the plethora of open source options, it is difficult to find a single system that has a modular architecture where certain components may be replaced, does not depend on external sources, can easily be updated to newer Wikipedia versions, and, most important of all, has state-of-the-art performance. The REL system presented in this paper aims to fill that gap. Building on…

Neural Entity Linking: A Survey of Models based on Deep Learning

This work distills a generic architecture of a neural EL system and discusses its components, such as candidate generation, mention-context encoding, and entity ranking, summarizing prominent methods for each of them.

REBL: Entity Linking at Scale

This paper describes the experience of optimizing the Radboud Entity Linking (REL) toolkit for batch processing of large corpora; the resulting extension makes it easier to isolate GPU-heavy operations from CPU-heavy ones and improves the entity disambiguation module.

EntQA: Entity Linking as Question Answering

EntQA first proposes candidate entities with a fast retrieval module, and then scrutinizes the document to find mentions of each candidate with a powerful reader module, which capitalizes on pretrained models for dense entity retrieval and reading comprehension.

Named Entity Recognition and Linking on Historical Newspapers: UvA.ILPS & REL at CLEF HIPE 2020

This paper describes a submission to the CLEF HIPE 2020 shared task on identifying named entities in multilingual historical newspapers in French, German, and English; the submission uses an ensemble of fine-tuned BERT models for named entity recognition and entity linking.

Autoregressive Entity Retrieval

Entities are at the center of how we represent and aggregate knowledge. For instance, Encyclopedias such as Wikipedia are structured by entities (e.g., one per article). The ability to retrieve such

ReFinED: An Efficient Zero-shot-capable Approach to End-to-End Entity Linking

ReFinED is an efficient end-to-end entity linking model that uses fine-grained entity types and entity descriptions to perform linking; it is an effective and cost-efficient system for extracting entities from web-scale datasets.

Entity Linking Meets Deep Learning: Techniques and Solutions

A new taxonomy is proposed that organizes existing DL-based EL methods along three axes: embedding, feature, and algorithm. The paper systematically surveys the representative EL methods along these three axes.

Entity-aware Transformers for Entity Search

It is observed empirically that entity-enriched BERT models enable fine-tuning on limited training data, which otherwise would not be feasible due to the known instabilities of BERT in few-sample fine-tuning, thereby contributing to data-efficient training of BERT for entity search.

Reddit Entity Linking Dataset

Goodwill Hunting: Analyzing and Repurposing Off-the-Shelf Named Entity Linking Systems

This work lays out and investigates two challenges faced by individuals or organizations building NEL systems, and shows how tailoring a simple technique for patching models using weak labeling can provide a 25% absolute improvement in accuracy of sport-related errors.



GERBIL - Benchmarking Named Entity Recognition and Linking consistently

GERBIL aims to become a focal point for the state of the art, driving the research agenda of the community by presenting comparable, objective evaluation results. It tackles the central problem of evaluating entity linking by answering the question of how an evaluation algorithm can compare two URIs to each other without being bound to a specific knowledge base.

Improving Entity Linking by Modeling Latent Relations between Mentions

This work treats relations as latent variables in the neural entity-linking model so that the injected structural bias helps to explain regularities in the training data and achieves the best reported scores on the standard benchmark and substantially outperforms its relation-agnostic version.

Lightweight Multilingual Entity Extraction and Linking

An accurate and lightweight, multilingual named entity recognition (NER) and linking (NEL) system that achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA CONLL data is presented.

From TagME to WAT: a new entity annotator

A novel entity annotator for texts which hinges on TagME's algorithmic technology, currently the best one available, and can be interpreted as a flexible library of several parsing/disambiguation and pruning modules that can be used to build up new and more sophisticated entity annotators.

End-to-End Neural Entity Linking

This work proposes the first neural end-to-end EL system that jointly discovers and links entities in a text document and shows that it significantly outperforms popular systems on the Gerbil platform when enough training data is available.

Exploiting Entity Linking in Queries for Entity Retrieval

A new probabilistic component is introduced and it is shown how it can be applied on top of any term-based entity retrieval model that can be emulated in the Markov Random Field framework, including language models, sequential dependence models, as well as their fielded variations.

TAGME: on-the-fly annotation of short text fragments (by wikipedia entities)

We designed and implemented TAGME, a system that is able to efficiently and judiciously augment a plain-text with pertinent hyperlinks to Wikipedia pages. The specialty of TAGME with respect to known

Learning to link with wikipedia

This paper explains how machine learning can be used to identify significant terms within unstructured text and enrich it with links to the appropriate Wikipedia articles; the resulting approach performs very well, with recall and precision of almost 75%.

Multi-step classification approaches to cumulative citation recommendation

Two multi-step classification approaches are proposed for knowledge base acceleration systems that consist of two and three binary classification steps, respectively, and it is shown that both approaches deliver state-of-the-art performance.

Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation

A novel embedding method specifically designed for NED that jointly maps words and entities into the same continuous vector space and extends the skip-gram model by using two models.