Corpus ID: 215416298

Evaluating Machines by their Real-World Language Use

  • Rowan Zellers, Ari Holtzman, Elizabeth Clark, Lianhui Qin, Ali Farhadi, Yejin Choi
  • Published 2020
  • Computer Science
  • ArXiv
  • There is a fundamental gap between how humans understand and use language -- in open-ended, real-world situations -- and today's NLP benchmarks for language understanding. To narrow this gap, we propose to evaluate machines by their success at real-world language use -- which greatly expands the scope of language tasks that can be measured and studied. We introduce TuringAdvice, a new challenge for language understanding systems. Given a complex situation faced by a real person, a machine must…
    Cited by:
    • Experience Grounds Language
    • Evaluation of Text Generation: A Survey
    • Measuring Massive Multitask Language Understanding
    • Help! Need Advice on Identifying Advice
    • Forecasting AI Progress: A Research Agenda

