XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
- Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
- Computer Science, Linguistics · ICML
- 24 March 2020
The Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) benchmark is introduced: a multi-task benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
Are Sixteen Heads Really Better than One?
The surprising observation is made that even when models have been trained using multiple heads, a large percentage of attention heads can in practice be removed at test time without significantly impacting performance.
Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
- Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick
- Computer Science · ICLR
- 16 January 2019
This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: based on the model's current mutual information between the latent variable and the observation, the inference network is aggressively optimized before each model update.
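The aggressive training schedule summarized above can be sketched as a simple control loop. This is a minimal sketch, not the authors' code: the update functions and the mutual-information estimator are hypothetical stand-ins, and only the scheduling logic is illustrated.

```python
# Hedged sketch of an "aggressive" VAE training step: keep optimizing the
# inference network (encoder) alone until the estimated mutual information
# I(z; x) stops improving, then perform one standard joint update.
# `mutual_information`, `update_inference`, and `update_both` are
# hypothetical stand-ins for model-specific routines.

def aggressive_train_step(state, mutual_information, update_inference,
                          update_both, max_inner_steps=100, tol=1e-4):
    """One outer step of the aggressive schedule; returns the new state."""
    prev_mi = mutual_information(state)
    for _ in range(max_inner_steps):
        state = update_inference(state)      # inner loop: encoder-only updates
        mi = mutual_information(state)
        if mi - prev_mi < tol:               # MI plateaued: inference caught up
            break
        prev_mi = mi
    return update_both(state)                # standard joint VAE update


if __name__ == "__main__":
    # Toy instantiation: "state" is a scalar, MI saturates at 1.0.
    mi_fn = lambda s: min(s, 1.0)
    final = aggressive_train_step(0.0, mi_fn,
                                  update_inference=lambda s: s + 0.2,
                                  update_both=lambda s: s + 0.01)
    print(round(final, 2))  # inner loop runs until MI saturates, then one joint step
```

The key design point is the stopping criterion: the inner loop ends when mutual information plateaus, so the encoder no longer lags the generative model at each joint update.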
Stack-Pointer Networks for Dependency Parsing
A novel architecture for dependency parsing is proposed: stack-pointer networks (StackPtr), which first read and encode the whole sentence, then build the dependency tree top-down in a depth-first fashion, yielding an efficient decoding algorithm with O(n^2) time complexity.
Learning to Translate in Real-time with Neural Machine Translation
A neural machine translation (NMT) framework for simultaneous translation is proposed, in which an agent learns, through interaction with a pre-trained NMT environment, when to translate.
Controllable Invariance through Adversarial Feature Learning
This paper shows that the proposed framework induces an invariant representation and leads to better generalization, as evidenced by improved performance on three benchmark tasks.
Stress Test Evaluation for Natural Language Inference
- Aakanksha Naik, Abhilasha Ravichander, N. Sadeh, C. Rosé, Graham Neubig
- Computer Science · COLING
- 2 June 2018
This work proposes an evaluation methodology consisting of automatically constructed “stress tests” that allow us to examine whether systems have the ability to make real inferential decisions, and reveals strengths and weaknesses of these models with respect to challenging linguistic phenomena.
DyNet: The Dynamic Neural Network Toolkit
DyNet is a toolkit for implementing neural network models based on dynamic declaration of network structure. It has an optimized C++ backend and a lightweight graph representation, and is designed to let users implement their models in a way that is idiomatic in their preferred programming language.
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
TaBERT is a pretrained LM that jointly learns representations for natural language (NL) sentences and (semi-)structured tables. It achieves new best results on the challenging weakly-supervised semantic parsing benchmark WikiTableQuestions, while performing competitively on the text-to-SQL dataset Spider.
Competence-based Curriculum Learning for Neural Machine Translation
- Emmanouil Antonios Platanios, Otilia Stretcu, Graham Neubig, B. Póczos, Tom Michael Mitchell
- Computer Science · NAACL
- 23 March 2019
A curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and yields overall better performance for both recurrent neural network models and Transformers.
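The idea of competence-based sampling can be sketched as follows. This is a hedged illustration, assuming a square-root competence schedule and treating "difficulty" as a pre-sorted ordering of the data; parameter names are illustrative, not the paper's.

```python
import random

# Hedged sketch of competence-based curriculum sampling: at step t the
# model's "competence" c(t) grows from c0 toward 1.0, and training batches
# are drawn only from the easiest c(t) fraction of the data (sorted
# easiest-first by some difficulty measure, e.g. sentence length).

def competence(t, total_steps, c0=0.01):
    """Square-root competence schedule rising from c0 at t=0 to 1.0 at t=total_steps."""
    return min(1.0, (t * (1 - c0 ** 2) / total_steps + c0 ** 2) ** 0.5)

def sample_batch(examples_by_difficulty, t, total_steps, batch_size, rng):
    """Sample uniformly from the easiest competence(t) fraction of the data."""
    cutoff = max(1, int(competence(t, total_steps) * len(examples_by_difficulty)))
    return [rng.choice(examples_by_difficulty[:cutoff]) for _ in range(batch_size)]


if __name__ == "__main__":
    data = list(range(10))           # toy "examples", already sorted easiest-first
    rng = random.Random(0)
    early = sample_batch(data, t=0, total_steps=100, batch_size=5, rng=rng)
    late = sample_batch(data, t=100, total_steps=100, batch_size=5, rng=rng)
    print(early, late)               # early batches use only the easiest examples
```

Early in training the cutoff restricts sampling to the easiest examples; by `t = total_steps` the competence reaches 1.0 and batches are drawn from the full dataset, recovering standard training.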