Convolutional Sequence to Sequence Learning
- Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann Dauphin
- Computer Science, ICML
- 8 May 2017
This work introduces an architecture based entirely on convolutional neural networks that outperforms the accuracy of the deep LSTM setup of Wu et al. (2016) on both WMT'14 English-German and WMT'14 English-French translation, at an order of magnitude faster speed on both GPU and CPU.
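The building block behind this speed is a gated convolution: every position is processed in one parallel pass instead of a sequential recurrence. Below is a minimal sketch of such a block in PyTorch; it is an illustration of the idea, not the authors' fairseq implementation, and the sizes are arbitrary.

```python
# Minimal sketch of a gated convolutional block in the ConvS2S style.
import torch
import torch.nn as nn

class GatedConvBlock(nn.Module):
    """One convolution with a GLU gate and a residual connection.

    All source positions are convolved at once, so no sequential
    recurrence is needed, unlike an LSTM encoder.
    """
    def __init__(self, dim: int, kernel_size: int = 3):
        super().__init__()
        # Project to 2*dim channels; the GLU gate halves them back to dim.
        self.conv = nn.Conv1d(dim, 2 * dim, kernel_size,
                              padding=kernel_size // 2)
        self.glu = nn.GLU(dim=1)  # gate over the channel dimension

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim, seq_len)
        return x + self.glu(self.conv(x))

x = torch.randn(8, 512, 20)     # a batch of 20-token sequences
y = GatedConvBlock(512)(x)      # same shape, computed in parallel
```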
Language Modeling with Gated Convolutional Networks
A finite-context approach through stacked convolutions is developed, which can be more efficient than recurrent networks since it allows parallelization over sequential tokens; this is the first time a non-recurrent approach has been competitive with strong recurrent models on these large-scale language tasks.
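For language modeling the convolution must be causal: a position may only see earlier tokens. A minimal sketch, assuming PyTorch, where left-padding enforces causality while all positions are still computed in one parallel pass:

```python
# Causal gated convolution: pad only on the left so position t never
# sees tokens after t. Sizes are illustrative.
import torch
import torch.nn as nn

class CausalGatedConv(nn.Module):
    def __init__(self, dim: int, kernel_size: int = 4):
        super().__init__()
        self.pad = kernel_size - 1            # left-padding amount
        self.conv = nn.Conv1d(dim, 2 * dim, kernel_size)
        self.glu = nn.GLU(dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim, seq_len); pad (left, right) on the last dim
        x = nn.functional.pad(x, (self.pad, 0))
        return self.glu(self.conv(x))

h = torch.randn(2, 256, 50)
out = CausalGatedConv(256)(h)   # (2, 256, 50); output t depends only on <= t
```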
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks, and supports distributed training across multiple GPUs and machines.
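As a quick usage sketch, pretrained fairseq models can be loaded through torch.hub, following the pattern shown in the fairseq README; the exact model name (`transformer.wmt19.en-de.single_model`) and tokenizer/BPE arguments depend on the fairseq release you have installed.

```python
# Usage sketch: load a pretrained fairseq translation model via torch.hub.
# Model name and tokenizer/bpe options follow the fairseq README and may
# vary across releases; a network connection is required for the download.
import torch

en2de = torch.hub.load('pytorch/fairseq',
                       'transformer.wmt19.en-de.single_model',
                       tokenizer='moses', bpe='fastbpe')
print(en2de.translate('Machine translation is useful.', beam=5))
```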
Understanding Back-Translation at Scale
This work broadens the understanding of back-translation and investigates a number of methods for generating synthetic source sentences, finding that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective.
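The sampling recipe can be sketched in a few lines: translate monolingual target sentences back into the source language by drawing each token from the reverse model's softmax, rather than taking the beam-search argmax, which yields more diverse synthetic sources. The `reverse_model.step` interface below is a hypothetical stand-in, not the paper's code.

```python
# Sketch of sampled back-translation with a hypothetical reverse
# (target-to-source) model; `reverse_model.step` is assumed to return
# next-token logits given the target sentence and the prefix so far.
import torch

def sample_synthetic_source(reverse_model, target_tokens,
                            max_len=100, bos=0, eos=2):
    """Ancestral sampling: draw each source token from the full softmax."""
    generated = [bos]
    for _ in range(max_len):
        logits = reverse_model.step(target_tokens, generated)  # (vocab,)
        probs = torch.softmax(logits, dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1).item()
        generated.append(next_tok)
        if next_tok == eos:
            break
    return generated  # pair with target_tokens as a synthetic training example
```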
3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training
- Dario Pavllo, Christoph Feichtenhofer, David Grangier, Michael Auli
- Computer Science, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 28 November 2018
In this work, we demonstrate that 3D poses in video can be effectively estimated with a fully convolutional model based on dilated temporal convolutions over 2D keypoints. We also introduce back-projection, a simple and effective semi-supervised training method that leverages unlabeled video data.
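A minimal sketch of the temporal model, assuming PyTorch: a stack of dilated 1D convolutions over a window of 2D keypoints (x, y for 17 joints) regresses the 3D pose of the center frame. Channel widths, kernel sizes, and dilations here are illustrative, not the paper's exact configuration.

```python
# Dilated temporal convolutions over 2D keypoints -> one 3D pose.
# With kernel 3 and dilations 1, 3, 9, the receptive field is 27 frames.
import torch
import torch.nn as nn

num_joints = 17
model = nn.Sequential(
    nn.Conv1d(num_joints * 2, 1024, kernel_size=3, dilation=1),
    nn.ReLU(),
    nn.Conv1d(1024, 1024, kernel_size=3, dilation=3),  # widen receptive field
    nn.ReLU(),
    nn.Conv1d(1024, num_joints * 3, kernel_size=3, dilation=9),
)

frames = torch.randn(1, num_joints * 2, 27)  # 27 frames of 2D keypoints
pose3d = model(frames)                        # (1, 51, 1): center-frame 3D pose
```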
Neural Text Generation from Structured Data with Application to the Biography Domain
A neural model for concept-to-text generation is introduced that scales to large, rich domains and significantly outperforms a classical Kneser-Ney language model adapted to this task by nearly 15 BLEU.
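The setup can be illustrated with a small, hypothetical sketch: embed each (field, value) pair of a table such as a Wikipedia infobox, pool the pairs into a context vector, and let that vector bias next-word prediction. This shows the concept-to-text conditioning, not the paper's exact architecture.

```python
# Hypothetical sketch of a table-conditioned language model.
import torch
import torch.nn as nn

class TableConditionedLM(nn.Module):
    def __init__(self, vocab, n_fields, dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, dim)
        self.field_emb = nn.Embedding(n_fields, dim)
        self.rnn = nn.GRU(2 * dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, words, field_ids, value_ids):
        # Pool the table into one context vector, repeated at every step.
        table = (self.field_emb(field_ids) + self.word_emb(value_ids)).mean(1)
        ctx = table.unsqueeze(1).expand(-1, words.size(1), -1)
        h, _ = self.rnn(torch.cat([self.word_emb(words), ctx], dim=-1))
        return self.out(h)  # next-word logits

lm = TableConditionedLM(vocab=5000, n_fields=50)
logits = lm(torch.randint(0, 5000, (2, 12)),   # 12 generated words so far
            torch.randint(0, 50, (2, 6)),      # 6 infobox field ids
            torch.randint(0, 5000, (2, 6)))    # their value tokens
```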
Scaling Neural Machine Translation
This paper shows that reduced precision and large-batch training can speed up training by nearly 5x on a single 8-GPU machine, given careful tuning and implementation.
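The two ingredients can be sketched with PyTorch's AMP utilities: FP16 compute via autocast, and large effective batches via gradient accumulation. The tiny linear model and the hyperparameters below are placeholders, not the paper's NMT setup.

```python
# Sketch: mixed-precision training plus gradient accumulation.
import torch
import torch.nn as nn

model = nn.Linear(512, 512).cuda()            # stand-in for the NMT model
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scaler = torch.cuda.amp.GradScaler()
accum_steps = 16                              # simulate a 16x larger batch

for step in range(160):
    x = torch.randn(32, 512, device='cuda')
    with torch.cuda.amp.autocast():           # forward pass in FP16
        loss = model(x).pow(2).mean() / accum_steps
    scaler.scale(loss).backward()             # scale to avoid FP16 underflow
    if (step + 1) % accum_steps == 0:         # one update per large batch
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad()
```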
Label Embedding Trees for Large Multi-Class Tasks
Two methods are proposed: an algorithm for learning a tree structure of classifiers which, by optimizing the overall tree loss, achieves superior accuracy to existing tree labeling methods; and a method that learns to embed labels in a low-dimensional space, which is faster than non-embedding approaches and more accurate than existing embedding approaches.
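The label-embedding half of the idea is easy to sketch: map inputs and labels into a shared low-dimensional space, so prediction reduces to a nearest-label lookup there, which stays cheap even with many classes. The tree-learning part is omitted, and all sizes below are illustrative.

```python
# Sketch of prediction with learned label embeddings.
import torch

n_classes, in_dim, embed_dim = 10_000, 512, 64
W = torch.randn(in_dim, embed_dim)       # learned input projection
E = torch.randn(n_classes, embed_dim)    # learned label embeddings

def predict(x: torch.Tensor) -> torch.Tensor:
    z = x @ W                            # embed the input: (batch, embed_dim)
    scores = z @ E.T                     # similarity to every label embedding
    return scores.argmax(dim=-1)         # nearest label wins

print(predict(torch.randn(4, in_dim)))   # 4 predicted class ids
```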
A Convolutional Encoder Model for Neural Machine Translation
A faster and simpler architecture based on a succession of convolutional layers is presented; it encodes the entire source sentence simultaneously, unlike recurrent networks, whose computation is constrained by temporal dependencies.
Efficient softmax approximation for GPUs
- Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, H. Jégou
- Computer Science, ICML
- 14 September 2016
This work proposes an approximate strategy for efficiently training neural network based language models over very large vocabularies, exploiting the unbalanced word distribution to form clusters that explicitly minimize the expected computational cost.
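PyTorch ships an implementation of this method as `nn.AdaptiveLogSoftmaxWithLoss`: frequent words live in a small head cluster, rare words in progressively lower-capacity tail clusters. The cutoffs below are illustrative for a 100k-word vocabulary, not values from the paper.

```python
# Using PyTorch's adaptive softmax, which implements this paper's method.
import torch
import torch.nn as nn

vocab_size, hidden = 100_000, 512
asm = nn.AdaptiveLogSoftmaxWithLoss(
    in_features=hidden,
    n_classes=vocab_size,
    cutoffs=[2_000, 10_000, 50_000],  # head holds the 2k most frequent words
    div_value=4.0,                    # shrink tail-cluster capacity 4x per level
)

hidden_states = torch.randn(32, hidden)        # e.g. RNN outputs
targets = torch.randint(0, vocab_size, (32,))
output, loss = asm(hidden_states, targets)     # mean NLL over the batch
```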