ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning
We present a new dataset of image caption annotations, Conceptual Captions, which contains an order of magnitude more images than the MS-COCO dataset (Lin et al., 2014) and represents a wider variety...
Findings of the 2012 Workshop on Statistical Machine Translation
A large-scale manual evaluation of 103 machine translation systems submitted by 34 teams was conducted, and the resulting system rankings were used to measure how strongly 12 automatic evaluation metrics correlate with human judgments of translation quality.
Sentence Level Discourse Parsing using Syntactic and Lexical Information
Two probabilistic models that can be used to identify elementary discourse units and build sentence-level discourse parse trees are introduced and shown to be sophisticated enough to yield discourse trees at an accuracy level that matches near-human levels of performance.
Findings of the 2013 Workshop on Statistical Machine Translation
We present the results of the WMT13 shared tasks, which included a translation task, a task for run-time estimation of machine translation quality, and an unofficial metrics task. This year, 143...
Findings of the 2014 Workshop on Statistical Machine Translation
This paper presents the results of the WMT14 shared tasks, which included a standard news translation task, a separate medical translation task, a task for run-time estimation of machine translation...
Automatic Question Answering: Beyond the Factoid
This paper describes and evaluates a Question Answering system that goes beyond answering factoid questions. The system is built around a noisy-channel architecture that exploits both a language model for answers and a transformation model for answer/question terms.
Unsupervised Morphology Induction Using Word Embeddings
A language-agnostic, unsupervised method for inducing morphological transformations between words that relies on certain regularities manifest in high-dimensional vector spaces and is capable of discovering a wide range of morphological rules.
Automatic question answering using the web: Beyond the Factoid
A Question Answering (QA) system that goes beyond answering factoid questions is described and evaluated, by comparing the performance of baseline algorithms against the proposed algorithms for various modules in the QA system.
Automatic Prediction of Parser Accuracy
This paper proposes a technique that automatically takes into account certain characteristics of the domains of interest and accurately predicts parser performance on data from these new domains, yielding a cheap and effective recipe for measuring the performance of a statistical parser on any given domain.