• Publications
A Neural Attention Model for Abstractive Sentence Summarization
TLDR
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
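As a rough illustration of the attention idea this TLDR describes (not the paper's exact architecture; the single-step decoder, parameter names, and dimensions below are illustrative assumptions), one decoding step that conditions the next summary word on the input sentence via attention could look like:

```python
# Hedged sketch of one attention-based decoding step: score the encoder states,
# form a context vector, and produce a next-word distribution.
# Shapes and parameter names are illustrative, not the paper's.
import torch
import torch.nn.functional as F

def attention_decode_step(dec_state, enc_states, W_out):
    scores = enc_states @ dec_state                    # (src_len,) alignment scores
    weights = F.softmax(scores, dim=0)                 # attention over source words
    context = weights @ enc_states                     # (hidden,) weighted source summary
    logits = W_out @ torch.cat([dec_state, context])   # (vocab,) next-word scores
    return F.softmax(logits, dim=0), weights

hidden, src_len, vocab = 4, 6, 10
probs, attn = attention_decode_step(
    torch.randn(hidden), torch.randn(src_len, hidden), torch.randn(vocab, 2 * hidden)
)
```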
OpenNMT: Open-Source Toolkit for Neural Machine Translation
TLDR
The toolkit prioritizes efficiency, modularity, and extensibility with the goal of supporting NMT research into model architectures, feature representations, and source modalities, while maintaining competitive performance and reasonable training requirements.
Character-Aware Neural Language Models
TLDR
A simple neural language model that relies only on character-level inputs and is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages, character inputs are sufficient for language modeling.
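A minimal sketch of the character-level word encoding idea behind this model, assuming a small convolution over character embeddings followed by max-pooling (hyperparameters and class names are illustrative, not the paper's):

```python
# Hedged sketch: characters of one word are embedded, convolved over, and
# max-pooled into a single word vector that a word-level RNN language model
# can consume in place of a word embedding. Illustrative shapes only.
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    def __init__(self, n_chars=100, char_dim=16, n_filters=32, width=3):
        super().__init__()
        self.embed = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=width)

    def forward(self, char_ids):                    # (batch, word_len)
        x = self.embed(char_ids).transpose(1, 2)    # (batch, char_dim, word_len)
        x = torch.relu(self.conv(x))                # (batch, n_filters, word_len - width + 1)
        return x.max(dim=2).values                  # (batch, n_filters) word vector

enc = CharWordEncoder()
word_vec = enc(torch.randint(0, 100, (2, 7)))       # two 7-character words
# word_vec can now be fed to an LSTM language model instead of word embeddings.
```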
Challenges in Data-to-Document Generation
TLDR
A new, large-scale corpus of data records paired with descriptive documents is introduced, a series of extractive evaluation methods for analyzing performance are proposed, and baseline results are obtained using current neural generation methods.
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks
TLDR
A conditional recurrent neural network (RNN) that generates a summary of an input sentence, significantly outperforming the recently proposed state-of-the-art method on the Gigaword corpus while performing competitively on the DUC-2004 shared task.
Bottom-Up Abstractive Summarization
TLDR
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
Adversarially Regularized Autoencoders
TLDR
This work proposes a flexible method for training deep latent variable models of discrete structures based on the recently proposed Wasserstein autoencoder (WAE), and shows that the latent representation can be trained to perform unaligned textual style transfer, with improvements in both automatic and human evaluation over existing methods.
Sequence-Level Knowledge Distillation
TLDR
It is demonstrated that standard knowledge distillation applied to word-level prediction can be effective for NMT, and two novel sequence-level versions of knowledge distillation are introduced that further improve performance and, somewhat surprisingly, seem to eliminate the need for beam search.
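For concreteness, a hedged sketch of the word-level distillation loss referred to above; the sequence-level variants instead train the student on whole output sequences produced by the teacher (e.g., its beam-search output). Tensor shapes and the temperature parameter are assumptions, not the paper's settings:

```python
# Hedged sketch of word-level knowledge distillation for a seq2seq student:
# cross-entropy against the teacher's softened per-position distributions.
import torch
import torch.nn.functional as F

def word_level_kd_loss(student_logits, teacher_logits, T=1.0):
    """Both tensors: (batch, seq_len, vocab). Averaged over batch and positions."""
    teacher_probs = F.softmax(teacher_logits / T, dim=-1)
    student_logp = F.log_softmax(student_logits / T, dim=-1)
    return -(teacher_probs * student_logp).sum(dim=-1).mean()

loss = word_level_kd_loss(torch.randn(2, 5, 10), torch.randn(2, 5, 10))
```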
Sequence-to-Sequence Learning as Beam-Search Optimization
TLDR
This work introduces a model and beam-search training scheme, based on the work of Daume III and Marcu (2005), that extends seq2seq to learn global sequence scores, and shows that this system outperforms a highly optimized attention-based seq2seq system and other baselines on three different sequence-to-sequence tasks: word ordering, parsing, and machine translation.
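The training scheme itself is more involved, but as a minimal illustration of the beam search that the global sequence scores are defined over (toy scorer and function names are assumptions, not the paper's implementation):

```python
# Hedged sketch of beam search: keep the k best partial hypotheses by
# cumulative score at each step. Scorer and names are purely illustrative.
import math

def beam_search(score_next, vocab, start, steps, k=3):
    """score_next(prefix, token) -> log-score of extending prefix with token."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = [
            (prefix + [tok], score + score_next(prefix, tok))
            for prefix, score in beams for tok in vocab
        ]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams

# Toy example: prefer tokens that differ from the previous one.
toy = lambda prefix, tok: 0.0 if tok != prefix[-1] else -math.inf
print(beam_search(toy, vocab=["a", "b"], start="a", steps=3))
```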
Compound Probabilistic Context-Free Grammars for Grammar Induction
TLDR
A formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar, which is modulated by a per-sentence continuous latent variable that induces marginal dependencies beyond the traditional context-free assumptions.