RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Yinhan Liu, Myle Ott, Veselin Stoyanov
- Computer Science · ArXiv
- 26 July 2019
This replication study finds that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
- M. Lewis, Yinhan Liu, Luke Zettlemoyer
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 29 October 2019
This work presents BART, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.
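As a rough illustration of the denoising objective this entry describes, the sketch below corrupts a token sequence by replacing sampled spans with a single mask token; a sequence-to-sequence model would then be trained to reconstruct the original. This is a minimal sketch only, not BART's exact noising recipe (which also permutes sentences and samples span lengths from a Poisson distribution), and all names in it are illustrative.

```python
import random

def infill_noise(tokens, mask_token="<mask>", mask_ratio=0.3, max_span=4, seed=0):
    """Replace randomly chosen spans of tokens with a single mask token.

    Toy text-infilling corruption: the decoder's training target is the
    original, uncorrupted sequence.
    """
    rng = random.Random(seed)
    budget = int(len(tokens) * mask_ratio)   # roughly how many tokens to hide
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            span = min(rng.randint(1, max_span), budget)
            out.append(mask_token)            # the whole span collapses to one mask
            budget -= span
            i += span
        else:
            out.append(tokens[i])
            i += 1
    return out

original = "the quick brown fox jumps over the lazy dog".split()
print(infill_noise(original))  # corrupted input; `original` is the reconstruction target
```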
Hierarchical Neural Story Generation
- Angela Fan, M. Lewis, Y. Dauphin
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 1 May 2018
This work collects a dataset of 300K human-written stories paired with writing prompts from an online forum, which enables hierarchical story generation: the model first generates a premise and then transforms it into a passage of text.
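A hedged sketch of that two-stage decomposition: sample a short premise first, then generate the passage conditioned on it. The stand-in samplers below are purely illustrative toys, not the paper's trained seq2seq models over the prompt/story pairs.

```python
import random

def sample_premise(prompt_pool, rng):
    """Stage 1: draw a premise (writing prompt). Stand-in for a trained prompt model."""
    return rng.choice(prompt_pool)

def expand_premise(premise, rng, n_sentences=3):
    """Stage 2: expand the premise into a passage, conditioning every sentence on it.
    Stand-in for a trained premise-conditioned story model."""
    openers = ["That morning,", "Nobody believed it, but", "Years later,"]
    return " ".join(f"{rng.choice(openers)} {premise.lower()}." for _ in range(n_sentences))

rng = random.Random(0)
toy_prompts = ["A lighthouse keeper finds a map", "The last library on Earth closes"]
premise = sample_premise(toy_prompts, rng)   # p(premise)
story = expand_premise(premise, rng)         # p(story | premise)
print(premise)
print(story)
```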
Multilingual Denoising Pre-training for Neural Machine Translation
- Yinhan Liu, Jiatao Gu, Luke Zettlemoyer
- Computer Science · Transactions of the Association for Computational Linguistics
- 22 January 2020
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks, and presents mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages.
End-to-end Neural Coreference Resolution
- Kenton Lee, Luheng He, M. Lewis, Luke Zettlemoyer
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 21 July 2017
This work introduces the first end-to-end coreference resolution model, which is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions.
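That training objective can be made concrete with a small sketch: for each candidate span the model scores a dummy antecedent plus all preceding candidate spans, and the loss is the negative log of the total softmax probability assigned to the span's gold antecedents. The scores and indices below are made-up toy values, not model outputs.

```python
import math

def marginal_log_likelihood(antecedent_scores, gold_antecedents):
    """Log of the summed softmax probability over a mention's gold antecedents.

    antecedent_scores[0] scores the dummy antecedent (no antecedent); the rest
    score preceding candidate spans. Illustrative of the objective only.
    """
    log_z = math.log(sum(math.exp(s) for s in antecedent_scores))              # log partition
    log_gold = math.log(sum(math.exp(antecedent_scores[j]) for j in gold_antecedents))
    return log_gold - log_z

# Mention with three candidate antecedents; candidates 1 and 3 are in its gold cluster.
scores = [0.0, 2.1, -0.5, 1.3]   # index 0 = dummy antecedent
print(marginal_log_likelihood(scores, gold_antecedents=[1, 3]))
```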
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- Patrick Lewis, Ethan Perez, Douwe Kiela
- Computer Science · Neural Information Processing Systems
- 22 May 2020
This work presents a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), models which combine pre-trained parametric and non-parametric memory for language generation, and finds that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
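A minimal sketch of how the two memories can be combined, assuming the retriever supplies a prior over retrieved documents and the generator supplies the likelihood of the target sequence given each document, so that log p(y|x) = log Σ_z p(z|x) p(y|x,z). The numbers are made-up stand-ins for retriever and generator scores, not outputs of the paper's models.

```python
import math

def marginalized_log_prob(doc_log_priors, gen_log_probs):
    """log p(y|x) = logsumexp over documents of log p(z|x) + log p(y|x, z)."""
    joint = [p + g for p, g in zip(doc_log_priors, gen_log_probs)]
    m = max(joint)                                   # numerically stable log-sum-exp
    return m + math.log(sum(math.exp(j - m) for j in joint))

# Three retrieved passages: retriever prior over documents, and generator
# log-likelihood of the target answer given each document.
doc_log_priors = [math.log(0.6), math.log(0.3), math.log(0.1)]
gen_log_probs = [-2.0, -5.0, -1.5]
print(marginalized_log_prob(doc_log_priors, gen_log_probs))
```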
Deep Semantic Role Labeling: What Works and What’s Next
- Luheng He, Kenton Lee, M. Lewis, Luke Zettlemoyer
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 1 July 2017
This work introduces a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses revealing its strengths and limitations; the model uses a deep highway BiLSTM architecture with constrained decoding.
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
- Alex Wang, Kyunghyun Cho, M. Lewis
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 8 April 2020
This work proposes QAGS (pronounced "kags"), an automatic evaluation protocol designed to identify factual inconsistencies in a generated summary, and argues that it is a promising tool for automatically generating usable and factually consistent text.
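The protocol can be sketched as: generate questions about the summary, answer each question from both the summary and the source document, and average an answer-agreement score. The `qg_model`, `qa_model`, and similarity callables below are hypothetical stand-ins (a real setup would use trained question-generation and QA models and token-level F1), so this illustrates the recipe rather than the paper's implementation.

```python
def qags_style_score(source, summary, qg_model, qa_model, answer_similarity):
    """Average agreement between answers extracted from the summary and from the source."""
    questions = qg_model(summary)
    if not questions:
        return 0.0
    scores = []
    for q in questions:
        ans_from_summary = qa_model(question=q, context=summary)
        ans_from_source = qa_model(question=q, context=source)
        scores.append(answer_similarity(ans_from_summary, ans_from_source))
    return sum(scores) / len(scores)

# Toy usage with trivial stand-ins.
toy_qg = lambda text: ["who won the match?"]
toy_qa = lambda question, context: "the home team" if "home team" in context else "unknown"
exact_match = lambda a, b: float(a == b)
print(qags_style_score("the home team won 2-0 last night",
                       "the home team won the match",
                       toy_qg, toy_qa, exact_match))  # 1.0: both contexts give the same answer
```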
Deal or No Deal? End-to-End Learning of Negotiation Dialogues
- M. Lewis, Denis Yarats, Y. Dauphin, Devi Parikh, Dhruv Batra
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 1 June 2017
This work shows, for the first time, that it is possible to train end-to-end models for negotiation, which must learn both linguistic and reasoning skills with no annotated dialogue states, and that the proposed training and decoding techniques dramatically improve performance.
Generalization through Memorization: Nearest Neighbor Language Models
- Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, M. Lewis
- Computer Science · International Conference on Learning Representations
- 1 November 2019
This work suggests that learning similarity between sequences of text is easier than predicting the next word, and that nearest neighbor search is an effective approach for language modeling in the long tail.
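The mechanism behind this result is an interpolation of the base language model's next-word distribution with a distribution induced by nearest neighbors retrieved from a datastore of cached contexts, p(w) = lam * p_knn(w) + (1 - lam) * p_lm(w). The sketch below uses toy probabilities and distances; the real datastore holds hidden-state context vectors from a trained LM and is searched with an approximate nearest-neighbor index.

```python
import math
from collections import Counter

def knn_lm_prob(token, lm_probs, neighbors, lam=0.25, temperature=1.0):
    """Interpolate a base LM distribution with one induced by retrieved neighbors.

    `neighbors` is a list of (distance, next_token) pairs from a datastore of
    (context, next-token) entries; weights are a softmax over negative distances.
    All values here are toy stand-ins.
    """
    weights = [math.exp(-d / temperature) for d, _ in neighbors]
    z = sum(weights)
    p_knn = Counter()
    for w, (_, tok) in zip(weights, neighbors):
        p_knn[tok] += w / z
    return lam * p_knn.get(token, 0.0) + (1 - lam) * lm_probs.get(token, 0.0)

# Base LM is unsure between "dog" and "cat"; retrieved neighbors mostly continue with "dog".
lm_probs = {"dog": 0.4, "cat": 0.4, "fish": 0.2}
neighbors = [(0.1, "dog"), (0.3, "dog"), (0.9, "cat")]
print(knn_lm_prob("dog", lm_probs, neighbors))
```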
...