BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

@article{Lewis2020BARTDS,
  title={BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension},
  author={M. Lewis and Yinhan Liu and Naman Goyal and Marjan Ghazvininejad and Abdelrahman Mohamed and Omer Levy and Veselin Stoyanov and Luke Zettlemoyer},
  journal={ArXiv},
  year={2020},
  volume={abs/1910.13461}
}
We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Transformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and many other more recent pretraining schemes. We evaluate a…
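The denoising objective described above can be made concrete with a small sketch. The following is not the authors' code; it is a minimal illustration that assumes the Hugging Face transformers library and its public facebook/bart-base checkpoint. It corrupts a sentence with a single toy mask-infilling span and scores the pretrained sequence-to-sequence model on reconstructing the original text.

# Minimal sketch of BART's denoising objective (not the authors' code).
# Assumes the Hugging Face `transformers` library and the public
# "facebook/bart-base" checkpoint: corrupt the input, then score the
# seq2seq model on reconstructing the original text.

import random
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def corrupt(text: str, span_len: int = 2) -> str:
    """Toy noising function: replace one random span of words with a single
    <mask> token, loosely mimicking BART's text-infilling corruption."""
    words = text.split()
    start = random.randrange(max(1, len(words) - span_len))
    return " ".join(words[:start] + ["<mask>"] + words[start + span_len:])

original = "BART is trained by corrupting text and learning to reconstruct it."
noisy = corrupt(original)

# The encoder sees the corrupted text; the decoder is supervised to emit
# the original, uncorrupted text.
inputs = tokenizer(noisy, return_tensors="pt")
labels = tokenizer(original, return_tensors="pt").input_ids

outputs = model(**inputs, labels=labels)
print("corrupted input    :", noisy)
print("reconstruction loss:", round(outputs.loss.item(), 3))

In the paper's actual pre-training setup the corruption is richer (the noising functions studied include multi-token span infilling and sentence permutation), but the reconstruction target is always the uncorrupted original text.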

Citations

Multilingual Denoising Pre-training for Neural Machine Translation
Pre-training via Paraphrasing
CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation
An Investigation of Fine-tuning Pre-trained Model for MR-to-Text Generation
  Ting Hu, C. Meinel. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), 2020.
SCRIPT: Self-Critic PreTraining of Transformers
Incorporating BERT into Parallel Sequence Decoding with Adapters

References

MASS: Masked Sequence to Sequence Pre-training for Language Generation
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Text Summarization with Pretrained Encoders
Attention is All you Need
Pre-trained Language Model Representations for Language Generation
Language Models are Unsupervised Multitask Learners
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks