Corpus ID: 225040574

mT5: A massively multilingual pre-trained text-to-text transformer

  @article{xue2020mt5,
    title={mT5: A massively multilingual pre-trained text-to-text transformer},
    author={Linting Xue and Noah Constant and A. Roberts and Mihir Kale and Rami Al-Rfou and Aditya Siddhant and A. Barua and Colin Raffel},
    journal={ArXiv},
    year={2020}
  }

  • Published 2020
  • Computer Science
  • ArXiv
  • The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We describe the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. All of the code and model…
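The "unified text-to-text format" the abstract refers to means every task, classification included, is cast as mapping a task-prefixed input string to an output string. A minimal sketch of that convention (the helper name is hypothetical; the prefix strings follow the examples published with the original T5 paper):

```python
# Sketch of the T5/mT5 text-to-text convention: the task is named in a
# plain-text prefix, and the model's answer is ordinary output text.
# `to_text_to_text` is a hypothetical helper, not part of any library.
def to_text_to_text(task_prefix: str, input_text: str) -> str:
    """Format one task instance in the text-to-text style."""
    return f"{task_prefix}: {input_text}"

# Task prefixes in the style of the original T5 paper:
print(to_text_to_text("translate English to German", "That is good."))
# translate English to German: That is good.
print(to_text_to_text("summarize", "mT5 was pre-trained on a dataset covering 101 languages ..."))
# summarize: mT5 was pre-trained on a dataset covering 101 languages ...
```

Because the task is expressed in the input text itself, the same pre-trained checkpoint and decoding loop serve every downstream task; only the prefix changes.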
    7 Citations


    • XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
    • Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization
    • ParsiNLU: A Suite of Language Understanding Challenges for Persian
    • What Makes Good In-Context Examples for GPT-3? (Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, W. Chen; 2021)


    References

    • Text-to-Text Pre-Training for Data-to-Text Tasks (19 citations)
    • Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (822 citations)
    • CamemBERT: a Tasty French Language Model (120 citations)
    • SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing (625 citations; highly influential)
    • FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding (7 citations)
    • PhoBERT: Pre-trained language models for Vietnamese (25 citations)
    • CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data (45 citations)
    • InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training (18 citations)
    • FlauBERT: Unsupervised Language Model Pre-training for French (53 citations)
    • WT5?! Training Text-to-Text Models to Explain their Predictions (10 citations)