Publications
Learned in Translation: Contextualized Word Vectors
Computer vision has benefited from initializing multiple deep layers with weights pretrained on large supervised training sets like ImageNet. Natural language processing (NLP) typically sees…
The Natural Language Decathlon: Multitask Learning as Question Answering
Presented on August 28, 2018 at 12:15 p.m. in the Pettit Microelectronics Research Center, Room 102 A/B.
CTRL: A Conditional Transformer Language Model for Controllable Generation
Large-scale language models show promising text generation capabilities, but users cannot easily control particular aspects of the generated text. We release CTRL, a 1.63 billion-parameter…
Neural Text Summarization: A Critical Evaluation
Text summarization aims at compressing long documents into a shorter form that conveys the most important parts of the original document. Despite increased interest in the community and notable…
Evaluating the Factual Consistency of Abstractive Text Summarization
Currently used metrics for assessing summarization algorithms do not account for whether summaries are factually consistent with source documents. We propose a weakly supervised, model-based approach…
Explain Yourself! Leveraging Language Models for Commonsense Reasoning
Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world knowledge or reasoning over information not immediately present in the…
Revisiting Activation Regularization for Language RNNs
Recurrent neural networks (RNNs) serve as a fundamental building block for many sequence tasks across natural language processing. Recent research has focused on recurrent dropout techniques or…
BERT is Not an Interlingua and the Bias of Tokenization
Multilingual transfer learning can benefit both high- and low-resource languages, but the source of these improvements is not well understood. Canonical Correlation Analysis (CCA) of the internal…
A Simple Language Model for Task-Oriented Dialogue
Task-oriented dialogue is often decomposed into three tasks: understanding user input, deciding actions, and generating a response. While such decomposition might suggest a dedicated model for each…
Prospective survey to verify the Ottawa ankle rules
  • B. McCann
  • Medicine
  • Journal of accident & emergency medicine
  • 1 January 2000
Editor: In their study to verify the Ottawa ankle rules, Perry et al point out "the potential dangers of rigidly adhering to decision rules".1 The study discovered that four malleolar fractures would…