• Publications
  • Influence
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
TLDR
We introduce the Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark, a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
  • 141
  • 40
Are Sixteen Heads Really Better than One?
TLDR
We make the surprising observation that even if models have been trained using multiple heads, in practice, a large percentage of attention heads can be removed at test time without significantly impacting performance.
  • 196
  • 28
Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
TLDR
We propose an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, we aggressively optimize the inference network before performing each model update.
  • 120
  • 28
DyNet: The Dynamic Neural Network Toolkit
TLDR
We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure.
  • 329
  • 27
Controllable Invariance through Adversarial Feature Learning
TLDR
Learning meaningful representations that maintain the content necessary for a particular task while filtering away detrimental variations is a problem of great interest in machine learning.
  • 127
  • 23
Stack-Pointer Networks for Dependency Parsing
TLDR
We introduce a novel architecture for dependency parsing: stack-pointer networks (StackPtr).
  • 98
  • 23
Learning to Translate in Real-time with Neural Machine Translation
TLDR
We propose a neural machine translation (NMT) framework for simultaneous translation in which an agent learns to make decisions on when to translate from interaction with a pre-trained NMT environment.
  • 77
  • 23
Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis
TLDR
We present a pointwise approach to Japanese morphological analysis that ignores structure information during learning and tagging.
  • 220
  • 22
Stress Test Evaluation for Natural Language Inference
TLDR
In this work, we propose an evaluation methodology consisting of automatically constructed "stress tests" that allow us to examine whether systems have the ability to make real inferential decisions.
  • 125
  • 22
Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T)
TLDR
Pseudo-code written in natural language can aid the comprehension of source code in unfamiliar programming languages.
  • 149
  • 20