GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

  • Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
  • Published in BlackboxNLP@EMNLP 2018
  • Computer Science
  • For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset. [...] Key Method: We further provide a hand-crafted diagnostic test suite that enables detailed linguistic analysis of NLU models. We evaluate baselines based on current methods for multi-task and transfer learning and find that they do not immediately…
    994 Citations


    KILT: a Benchmark for Knowledge Intensive Language Tasks
    • 5
    Language Models are Unsupervised Multitask Learners
    • 1,972
    Multi-task learning for natural language processing in the 2020s: where are we going?
    FlauBERT: Unsupervised Language Model Pre-training for French
    • 40
    • Highly Influenced
    KLEJ: Comprehensive Benchmark for Polish Language Understanding
    • 2
    • Highly Influenced
    Improving Language Understanding by Generative Pre-Training
    • 1,526
    • Highly Influenced
    Exploring and Predicting Transferability across NLP Tasks
    • 4
    • Highly Influenced
    Learning and Evaluating General Linguistic Intelligence
    • 70


    AllenNLP: A Deep Semantic Natural Language Processing Platform
    • 505
    One billion word benchmark for measuring progress in statistical language modeling
    • 706
    Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference
    • 68
    • Highly Influential