Publications
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
TLDR
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems.
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
TLDR
We propose a simple solution for translating between multiple languages with a single Neural Machine Translation (NMT) model, taking advantage of multilingual data to improve NMT.
Leveraging Linguistic Structure For Open Domain Information Extraction
TLDR
We replace a large set of patterns with a few patterns for canonically structured sentences, and shift the focus to a classifier which learns to extract self-contained clauses from longer sentences.
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
TLDR
We introduce XTREME, the Cross-lingual TRansfer Evaluation of Multilingual Encoders benchmark: a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation
TLDR
The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT).
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
TLDR
We introduce our efforts towards building a universal neural machine translation (NMT) system capable of translating between any language pair.
Massively Multilingual Neural Machine Translation
TLDR
We show that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages in 116 translation directions in a single model.
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
TLDR
Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.
Leveraging Weakly Supervised Data to Improve End-to-end Speech-to-text Translation
TLDR
We show that a high quality end-to-end ST model can be trained using only weakly supervised datasets, and that synthetic data sourced from unlabeled monolingual text or speech can be used to improve performance.
Bootstrapped Self Training for Knowledge Base Population
TLDR
We propose bootstrapped self-training to capture the benefits of both systems: the precision of patterns and the generalizability of trained models.