SpanBERT: Improving Pre-training by Representing and Predicting Spans

@article{Joshi2020SpanBERTIP,
  title={SpanBERT: Improving Pre-training by Representing and Predicting Spans},
  author={Mandar Joshi and Danqi Chen and Yinhan Liu and Daniel S. Weld and Luke Zettlemoyer and Omer Levy},
  journal={Transactions of the Association for Computational Linguistics},
  year={2020},
  volume={8},
  pages={64--77}
}
We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text. Our approach extends BERT by (1) masking contiguous random spans, rather than random tokens, and (2) training the span boundary representations to predict the entire content of the masked span, without relying on the individual token representations within it. SpanBERT consistently outperforms BERT and our better-tuned baselines, with substantial gains on span selection tasks such as question answering and coreference resolution.
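The two modifications described in the abstract are easiest to see in code. Below is a minimal sketch, assuming a PyTorch-style encoder; the names `sample_span_mask` and `SpanBoundaryObjective` are illustrative and are not the authors' released implementation. It shows (1) sampling contiguous spans with geometrically distributed lengths (p=0.2, clipped at 10) until roughly 15% of the sequence is masked, and (2) the span boundary objective (SBO), which predicts each masked token from the representations of the two tokens just outside the span plus a relative position embedding.

```python
# Sketch of SpanBERT-style span masking and the span boundary objective (SBO).
# Assumes a PyTorch setup; simplified (e.g. no whole-word alignment of spans).
import torch
import torch.nn as nn

def sample_span_mask(seq_len, mask_budget=0.15, p=0.2, max_span_len=10):
    """Pick disjoint contiguous spans until ~mask_budget of tokens are covered.

    Span lengths are drawn from a geometric distribution (p=0.2) and clipped
    at max_span_len, as described in the paper.
    """
    num_to_mask = int(seq_len * mask_budget)
    masked = torch.zeros(seq_len, dtype=torch.bool)
    spans, attempts = [], 0
    while masked.sum().item() < num_to_mask and attempts < 100:
        attempts += 1
        # Geometric(p) sample (0-based) + 1 gives a span length >= 1.
        length = min(int(torch.distributions.Geometric(p).sample().item()) + 1,
                     max_span_len)
        start = torch.randint(0, max(1, seq_len - length), (1,)).item()
        end = start + length  # exclusive
        if masked[start:end].any():
            continue  # keep spans disjoint for simplicity
        masked[start:end] = True
        spans.append((start, end))
    return masked, spans

class SpanBoundaryObjective(nn.Module):
    """Predict each masked token from the span's boundary representations.

    Inputs are the encoder outputs of the tokens just outside the span plus a
    relative position embedding of the target token; a 2-layer feed-forward
    network with GeLU activations and LayerNorm produces the prediction.
    """
    def __init__(self, hidden_size, vocab_size, max_span_len=10):
        super().__init__()
        self.pos_emb = nn.Embedding(max_span_len, hidden_size)
        self.mlp = nn.Sequential(
            nn.Linear(3 * hidden_size, hidden_size),
            nn.GELU(),
            nn.LayerNorm(hidden_size),
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.LayerNorm(hidden_size),
        )
        self.decoder = nn.Linear(hidden_size, vocab_size)

    def forward(self, left_boundary, right_boundary, rel_positions):
        # left_boundary, right_boundary: (num_targets, hidden_size) encoder
        # outputs of the tokens immediately before and after each span.
        # rel_positions: (num_targets,) 0-based position of each target token
        # relative to the left boundary of its span.
        h = torch.cat([left_boundary, right_boundary,
                       self.pos_emb(rel_positions)], dim=-1)
        return self.decoder(self.mlp(h))  # (num_targets, vocab_size) logits
```

In the full method, the masked positions are also replaced in the input (as in BERT's masking scheme), and the SBO loss is summed with the standard masked language modeling loss over the same masked tokens.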

Citations

Few-Shot Question Answering by Pretraining Span Selection
Studying Strategically: Learning to Mask for Closed-book QA
A Cross-Task Analysis of Text Span Representations
CorefQA: Coreference Resolution as Query-based Span Prediction
Coreference Resolution without Span Representations
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment
On Losses for Modern Language Models
Attending to Entities for Better Text Understanding
KgPLM: Knowledge-guided Language Model Pre-training via Generative and Discriminative Learning
