TMR: Evaluating NER Recall on Tough Mentions

Jingxuan Tu and Constantine Lignos
We propose the Tough Mentions Recall (TMR) metrics to supplement traditional named entity recognition (NER) evaluation by examining recall on specific subsets of "tough" mentions: unseen mentions, whose tokens or token/type combination were not observed in training, and type-confusable mentions, token sequences that appear with multiple entity types in the test data. We demonstrate the usefulness of these metrics by evaluating English, Spanish, and Dutch corpora using five recent neural…
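The two "tough" subsets described in the abstract can be sketched in a few lines. The following is a minimal illustration of the idea, not the authors' implementation; the function names, the `(tokens, type)` mention representation, and the handling of the empty case are all assumptions made for this sketch.

```python
# Hypothetical sketch of the Tough Mentions Recall (TMR) idea:
# recall computed only over "tough" gold mentions, e.g. mentions whose
# token sequence never appeared in the training data. Names and the
# mention representation are illustrative, not the paper's code.

def unseen_mention_recall(train_mentions, gold_mentions, pred_mentions):
    """Recall over gold mentions whose token sequence is unseen in training.

    Each mention is a (tokens, entity_type) pair, with tokens given as a
    tuple of strings so mentions are hashable.
    """
    seen_token_seqs = {tokens for tokens, _ in train_mentions}
    tough = [m for m in gold_mentions if m[0] not in seen_token_seqs]
    if not tough:
        return None  # recall is undefined when no gold mention is tough
    predicted = set(pred_mentions)
    recovered = sum(1 for m in tough if m in predicted)
    return recovered / len(tough)


def type_confusable_tokens(gold_mentions):
    """Token sequences annotated with more than one entity type in the data."""
    types_by_tokens = {}
    for tokens, entity_type in gold_mentions:
        types_by_tokens.setdefault(tokens, set()).add(entity_type)
    return {t for t, types in types_by_tokens.items() if len(types) > 1}
```

For example, if "New York"/LOC was seen in training but "Brandeis"/ORG was not, only the latter counts toward the unseen-mention recall denominator; a token sequence like "Washington" tagged both PER and LOC in the test data would be flagged as type-confusable.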
1 Citation
SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation
This work demonstrates that, despite the apparent simplicity of NER evaluation, unreported differences in the scoring procedure can produce changes to scores that are both of noticeable magnitude and statistically significant.


Portuguese Named Entity Recognition using BERT-CRF
This work applies a pre-trained BERT with a Conditional Random Fields (CRF) architecture to the NER task in Portuguese, combining the transfer capabilities of BERT with the structured predictions of CRF, and explores feature-based and fine-tuning training strategies for the BERT model.
Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study
This paper uses the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives, characterizing the differences in their generalization abilities through the lens of the proposed measures, which in turn guide the design of better models and training methods.
TSE-NER: An Iterative Approach for Long-Tail Entity Extraction in Scientific Publications
An iterative approach for training NER and NET classifiers in scientific publications that relies on minimal human input, namely a small seed set of instances for the targeted entity type, is presented.
If You Build Your Own NER Scorer, Non-replicable Results Will Come
This work attempts to replicate a named entity recognition (NER) model implemented in a popular toolkit and discovers that a critical barrier to doing so is the inconsistent evaluation of improper label sequences.
Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition
The goal of this task is to provide definitions of emerging and of rare entities, to build datasets for detecting these entities based on those definitions, and to evaluate the ability of participating systems to detect and classify novel and emerging named entities in noisy text.
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
A novel neural network architecture is introduced that automatically benefits from both word- and character-level representations by using a combination of bidirectional LSTM, CNN, and CRF, making it applicable to a wide range of sequence labeling tasks.
BERTje: A Dutch BERT Model
The transformer-based pre-trained language model BERT has helped improve state-of-the-art performance on many natural language processing (NLP) tasks. This work develops and evaluates a monolingual Dutch BERT model called BERTje, which consistently outperforms the equally sized multilingual BERT model on downstream NLP tasks.
Contextual String Embeddings for Sequence Labeling
This paper proposes leveraging the internal states of a trained character language model to produce a novel type of word embedding, referred to as contextual string embeddings, which fundamentally model words as sequences of characters and are contextualized by their surrounding text.
Design Challenges and Misconceptions in Neural Sequence Labeling
This work reproduces twelve neural sequence labeling models, which include most of the state-of-the-art structures, and conducts a systematic model comparison on three benchmarks, to reach several practical conclusions which can be useful to practitioners.
Towards Robust Linguistic Analysis using OntoNotes
An analysis of the performance of publicly available, state-of-the-art tools on all layers and languages in the OntoNotes v5.0 corpus is presented. It should set the benchmark for future development of NLP components in syntax and semantics, and may encourage research toward an integrated system that uses the various layers jointly to improve overall performance.