• Publications
  • Influence
A Neural Attention Model for Abstractive Sentence Summarization
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence. Expand
Dimensionality Reduction by Learning an Invariant Mapping
This work presents a method - called Dimensionality Reduction by Learning an Invariant Mapping (DrLIM) - for learning a globally coherent nonlinear function that maps the data evenly to the output manifold. Expand
Sequence Level Training with Recurrent Neural Networks
This work proposes a novel sequence level training algorithm that directly optimizes the metric used at test time, such as BLEU or ROUGE, and outperforms several strong baselines for greedy generation. Expand
Learning a similarity metric discriminatively, with application to face verification
The idea is to learn a function that maps input patterns into a target space such that the L/sub 1/ norm in the target space approximates the "semantic" distance in the input space. Expand
Memory Networks
This work describes a new class of learning models called memory networks, which reason with inference components combined with a long-term memory component; they learn how to use these jointly. Expand
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
This work argues for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering, and classify these tasks into skill sets so that researchers can identify (and then rectify) the failings of their systems. Expand
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
There is a sweet-spot, not too big and not too small, between single words and full sentences that allows the most meaningful information in a text to be effectively retained and recalled, and models which store explicit representations of long-term contexts outperform state-of-the-art neural language models at predicting semantic content words. Expand
Large-scale Simple Question Answering with Memory Networks
This paper studies the impact of multitask and transfer learning for simple question answering; a setting for which the reasoning required to answer is quite easy, as long as one can retrieve the correct evidence given a question, which can be difficult in large-scale conditions. Expand
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
A conditional recurrent neural network (RNN) which generates a summary of an input sentence which significantly outperforms the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task. Expand
A Tutorial on Energy-Based Learning
Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy to each configuration of the variab les. Inference consists in clamping the value of observedExpand