Publications
Deep contextualized word representations
TLDR: We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
  • 4,592 citations • 943 highly influential
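A minimal sketch of how the pretrained ELMo model is typically loaded through AllenNLP's Elmo module. The options/weights paths below are placeholders for the published model files, not exact URLs.

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

# Placeholder paths: substitute the published ELMo options/weights files
# distributed with AllenNLP (exact URLs omitted here).
OPTIONS_FILE = "elmo_options.json"
WEIGHTS_FILE = "elmo_weights.hdf5"

elmo = Elmo(OPTIONS_FILE, WEIGHTS_FILE, num_output_representations=1, dropout=0.0)

# Character ids for a batch of pre-tokenized sentences.
character_ids = batch_to_ids([["Deep", "contextualized", "word", "representations"]])
embeddings = elmo(character_ids)["elmo_representations"][0]
print(embeddings.shape)  # (batch, num_tokens, embedding_dim)
```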
RoBERTa: A Robustly Optimized BERT Pretraining Approach
TLDR: We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size.
  • 1,753 citations • 556 highly influential
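The released roberta-base checkpoint can be exercised with a standard fill-mask pipeline from the Hugging Face transformers library; note that RoBERTa uses "<mask>" rather than BERT's "[MASK]". A minimal sketch:

```python
from transformers import pipeline

# Fill-mask with the publicly released roberta-base checkpoint.
unmasker = pipeline("fill-mask", model="roberta-base")
for candidate in unmasker("RoBERTa is a carefully tuned <mask> of BERT."):
    print(candidate["token_str"], candidate["score"])
```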
Knowledge-Based Weak Supervision for Information Extraction of Overlapping Relations
TLDR: We present MULTIR, a novel approach for multi-instance learning with overlapping relations that combines a sentence-level extraction model with a simple, corpus-level component for aggregating the individual facts.
  • 709 citations • 118 highly influential
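A hedged sketch of the corpus-level aggregation idea, not the paper's actual code: sentence-level extractions for the same entity pair are combined with a deterministic OR, and a pair may retain several relations at once, which is what "overlapping relations" means here. All names are illustrative.

```python
from collections import defaultdict

def aggregate_facts(sentence_predictions):
    """Corpus-level aggregation sketch: an entity pair is assigned a relation
    if at least one sentence-level extraction asserts it (deterministic OR),
    and a pair may keep several relations at once (overlapping relations)."""
    facts = defaultdict(set)
    for entity_pair, relation in sentence_predictions:
        if relation != "NONE":
            facts[entity_pair].add(relation)
    return dict(facts)

predictions = [
    (("Barack Obama", "United States"), "born_in"),
    (("Barack Obama", "United States"), "president_of"),  # overlapping relation
    (("Barack Obama", "United States"), "NONE"),
]
print(aggregate_facts(predictions))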
Unsupervised Cross-lingual Representation Learning at Scale
TLDR: We present XLM-R, a transformer-based multilingual masked language model pre-trained on one hundred languages, which obtains state-of-the-art performance on cross-lingual classification, sequence labeling, and question answering.
  • 279 citations • 111 highly influential
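XLM-R ships as xlm-roberta-base and xlm-roberta-large in the Hugging Face transformers library; a single shared encoder covers all the languages, with no language-id input. A minimal sketch:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

# One shared encoder, no language id: mixed-language batches just work.
batch = tok(["The weather is nice today.", "Il fait beau aujourd'hui."],
            padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state
print(hidden.shape)  # (2, max_seq_len, 768) for the base model
```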
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
TLDR: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples.
  • 486 citations • 106 highly influential
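The dataset is distributed on the Hugging Face hub; assuming the "trivia_qa" dataset id and its "rc" (reading comprehension) configuration, a quick look at one question-answer-evidence triple:

```python
from datasets import load_dataset

# Assumes the "trivia_qa" hub id and its "rc" configuration.
ds = load_dataset("trivia_qa", "rc", split="validation")
example = ds[0]
print(example["question"])
print(example["answer"]["value"])  # canonical answer string
```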
End-to-end Neural Coreference Resolution
TLDR: We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector.
  • 354 citations • 80 highly influential
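AllenNLP publishes ready-to-run coreference predictors descended from this line of work. The archive path below is a placeholder; any published AllenNLP coreference model archive works with the same call.

```python
from allennlp.predictors.predictor import Predictor

# Placeholder path: substitute a published AllenNLP coreference model
# archive (the project distributes several as .tar.gz files).
predictor = Predictor.from_path("/path/to/coref-model.tar.gz")
result = predictor.predict(document="Paul told John that he had won the race.")
print(result["clusters"])  # spans (token-index pairs) grouped by entity
```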
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
TLDR: We present BART, a denoising autoencoder for pretraining sequence-to-sequence models.
  • 316 citations • 79 highly influential
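Because BART pairs a bidirectional encoder with an autoregressive decoder, the fine-tuned facebook/bart-large-cnn checkpoint can summarize text directly via transformers. A minimal sketch:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = ("BART is trained by corrupting text with an arbitrary noising "
           "function and learning a model to reconstruct the original text.")
inputs = tok(article, return_tensors="pt", truncation=True)
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=40)
print(tok.decode(summary_ids[0], skip_special_tokens=True))
```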
AllenNLP: A Deep Semantic Natural Language Processing Platform
TLDR: This paper describes AllenNLP, a platform for research on deep learning methods in natural language understanding.
  • 505 citations • 75 highly influential
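The platform's central abstraction is a text-to-tensor pipeline (tokenize, index against a vocabulary, pad into tensors); a minimal sketch using its public data API:

```python
from allennlp.data import Instance, Vocabulary
from allennlp.data.fields import TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer
from allennlp.data.tokenizers import WhitespaceTokenizer

# Tokenize -> index against a vocabulary -> pad into tensors: the
# pipeline every AllenNLP model is built on.
tokens = WhitespaceTokenizer().tokenize("AllenNLP turns text into tensors .")
field = TextField(tokens, {"tokens": SingleIdTokenIndexer()})
instance = Instance({"text": field})

vocab = Vocabulary.from_instances([instance])
instance.index_fields(vocab)
print(instance.as_tensor_dict())
```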
Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars
TLDR: This paper addresses the problem of mapping natural language sentences to lambda-calculus encodings of their meaning.
  • 750 citations • 60 highly influential
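The flavor of the task, in the spirit of the paper's GeoQuery experiments; the predicate names below are illustrative assumptions, not the paper's exact vocabulary.

```python
# Sentence -> lambda-calculus pairs of the kind the structured
# classifier is trained on (predicate names are illustrative).
examples = {
    "what states border texas":
        "lambda x . state(x) & borders(x, texas)",
    "what is the smallest state":
        "argmin(lambda x . state(x), lambda x . size(x))",
}
for sentence, logical_form in examples.items():
    print(f"{sentence!r} -> {logical_form}")
```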
SpanBERT: Improving Pre-training by Representing and Predicting Spans
TLDR: We present SpanBERT, a pre-training method that is designed to better represent and predict spans of text.
  • 258 citations • 60 highly influential
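The pre-training objective masks contiguous spans rather than individual tokens. Below is a self-contained sketch of span masking with geometrically distributed span lengths; the parameter defaults follow the paper's description, but the details are an approximation, not the authors' code.

```python
import random

def sample_masked_positions(num_tokens, budget=0.15, p=0.2, max_span=10, rng=random):
    """Sketch of SpanBERT-style span masking: draw contiguous spans whose
    lengths follow a geometric(p) distribution clipped at max_span, until
    roughly `budget` of the sequence is covered."""
    target = max(1, int(num_tokens * budget))
    masked = set()
    while len(masked) < target:
        # Geometric span length: P(len = k) = p * (1 - p)^(k - 1), clipped.
        length = 1
        while length < max_span and rng.random() > p:
            length += 1
        start = rng.randrange(0, num_tokens - length + 1)
        masked.update(range(start, start + length))
    return sorted(masked)

print(sample_masked_positions(100))  # ~15 positions, grouped into spans
```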