Coreference Resolution without Span Representations

Yuval Kirstain, Ori Ram, Omer Levy. Annual Meeting of the Association for Computational Linguistics.
The introduction of pretrained language models has reduced many complex task-specific NLP models to simple lightweight layers. An exception to this trend is coreference resolution, where a sophisticated task-specific model is appended to a pretrained transformer encoder. While highly effective, the model has a very large memory footprint, primarily due to dynamically constructed span and span-pair representations, which hinders the processing of complete documents and the ability to train on…


Scaling Within Document Coreference to Long Texts

This paper proposes an approximation to end-to-end models which scales gracefully to documents of any length and shows the resulting reduction of training and inference time compared to state-of-the-art methods with only a minimal loss in accuracy.

End-To-End Neural Coreference Resolution Revisited: A Simple Yet Effective Baseline

  • T. Lai, Trung Bui, Doo Soon Kim
  • ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2022
This work provides evidence for the necessity of carefully justifying the complexity of existing or newly proposed models, as introducing a conceptual or practical simplification to an existing model can still yield competitive results.

Improving Span Representation for Domain-adapted Coreference Resolution

This work develops methods to improve the span representations via a retrofitting loss to incentivize span representations to satisfy a knowledge-based distance function and a scaffolding loss to guide the recovery of knowledge from the span representation.

Word-Level Coreference Resolution

This work proposes to consider coreference links between individual words rather than word spans and then reconstruct the word spans, which reduces the complexity of the coreference model to O(n^2) and allows it to consider all potential mentions without pruning any of them out.

Tracing Origins: Coref-aware Machine Reading Comprehension

This paper imitates the human reading process of connecting anaphoric expressions and explicitly leverages coreference information to enhance the word embeddings from the pre-trained model, in order to highlight the coreference mentions that must be identified for coreference-intensive question answering in QUOREF, a relatively new dataset specifically designed to evaluate the coreference-related performance of a model.

Towards Natural Language Interfaces for Data Visualization: A Survey

This article conducts a comprehensive review of the existing V-NLIs and develops categorical dimensions based on a classic information visualization pipeline with the extension of a V-NLI layer.

LingMess: Linguistically Informed Multi Expert Scorers for Coreference Resolution

LingMess is presented, a linguistically motivated categorization of mention-pairs into 6 types of coreference decisions, with a dedicated trainable scoring function for each category, which significantly improves the accuracy of the pairwise scorer as well as the overall coreference performance on the English OntoNotes coreference corpus and 5 additional datasets.

Longtonotes: OntoNotes with Longer Coreference Chains

This work builds a corpus of coreference-annotated documents of significantly longer length than what is currently available by providing an accurate, manually curated merging of annotations from documents that were split into multiple parts in the original OntoNotes annotation process.

F-coref: Fast, Accurate and Easy to Use Coreference Resolution

This work introduces fastcoref, a python package for fast, accurate, and easy-to-use English coreference resolution, which offers two modes: an accurate mode based on the LingMess architecture, providing state-of-the-art coreference accuracy, and a substantially faster model, F-coref, which is the focus of this work.

End-to-end Neural Coreference Resolution

This work introduces the first end-to-end coreference resolution model, trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions.

Higher-Order Coreference Resolution with Coarse-to-Fine Inference

This work introduces a fully-differentiable approximation to higher-order inference for coreference resolution that significantly improves accuracy on the English OntoNotes benchmark, while being far more computationally efficient.

Revealing the Myth of Higher-Order Inference in Coreference Resolution

This paper implements an end-to-end coreference system as well as four HOI approaches — attended antecedents, entity equalization, span clustering, and cluster merging — of which the latter two are newly proposed in this work.

CorefQA: Coreference Resolution as Query-based Span Prediction

CorefQA is presented, an accurate and extensible approach for the coreference resolution task, formulated as a span prediction task as in question answering, which provides the flexibility of retrieving mentions left out at the mention proposal stage.

Coreference Resolution with Entity Equalization

This work shows how to represent each mention in a cluster via an approximation of the sum of all mentions in the cluster in a fully differentiable end-to-end manner, thus enabling high-order inferences in the resolution process.

BERT for Coreference Resolution: Baselines and Analysis

A qualitative analysis of model predictions indicates that, compared to ELMo and BERT-base, BERT-large is particularly better at distinguishing between related but distinct entities, but that there is still room for improvement in modeling document-level context, conversations, and mention paraphrasing.

SpanBERT: Improving Pre-training by Representing and Predicting Spans

The approach extends BERT by masking contiguous random spans, rather than random tokens, and training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it.

CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes

The OntoNotes annotation (coreference and other layers) is described, along with the parameters of the shared task, including the format, pre-processing information, and evaluation criteria; the results achieved by the participating systems are presented and discussed.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

Enhanced LSTM for Natural Language Inference

This paper presents a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference dataset, and demonstrates that carefully designed sequential inference models based on chain LSTMs can outperform all previous models.