XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
- Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson
- 24 March 2020
Computer Science, Linguistics
International Conference on Machine Learning
The Cross-lingual TRansfer Evaluation of Multilingual Encoders XTREME benchmark is introduced, a multi-task benchmark for evaluating the cross-lingually generalization capabilities of multilingual representations across 40 languages and 9 tasks.
Are Sixteen Heads Really Better than One?
- Paul Michel, Omer Levy, Graham Neubig
- 1 May 2019
Computer Science
Neural Information Processing Systems
It is made the surprising observation that even if models have been trained using multiple heads, in practice, a large percentage of attention heads can be removed at test time without significantly impacting performance.
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
- Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- 1 May 2020
Computer Science
Annual Meeting of the Association for…
TaBERT is a pretrained LM that jointly learns representations for NL sentences and (semi-)structured tables that achieves new best results on the challenging weakly-supervised semantic parsing benchmark WikiTableQuestions, while performing competitively on the text-to-SQL dataset Spider.
Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
- Junxian He, Daniel M. Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick
- 16 January 2019
Computer Science
International Conference on Learning…
This paper investigates posterior collapse from the perspective of training dynamics and proposes an extremely simple modification to VAE training to reduce inference lag: depending on the model's current mutual information between latent variable and observation, the inference network is optimized before performing each model update.
Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow
- Pengcheng Yin, Bowen Deng, Edgar Chen, Bogdan Vasilescu, Graham Neubig
- 23 May 2018
Computer Science
IEEE Working Conference on Mining Software…
A novel method to mine high-quality aligned data from SO using two sets of features: hand-crafted features considering the structure of the extracted snippets, and correspondence features obtained by training a probabilistic model to capture the correlation between NL and code using neural networks.
Weight Poisoning Attacks on Pretrained Models
- Keita Kurita, Paul Michel, Graham Neubig
- 14 April 2020
Computer Science
Annual Meeting of the Association for…
It is shown that it is possible to construct “weight poisoning” attacks where pre-trained weights are injected with vulnerabilities that expose “backdoors” after fine-tuning, enabling the attacker to manipulate the model prediction simply by injecting an arbitrary keyword.
Competence-based Curriculum Learning for Neural Machine Translation
- Emmanouil Antonios Platanios, Otilia Stretcu, Graham Neubig, B. Póczos, Tom Michael Mitchell
- 23 March 2019
Computer Science
North American Chapter of the Association for…
A curriculum learning framework for NMT that reduces training time, reduces the need for specialized heuristics or large batch sizes, and results in overall better performance, which can help improve the training time and the performance of both recurrent neural network models and Transformers.
Stress Test Evaluation for Natural Language Inference
- Aakanksha Naik, Abhilasha Ravichander, N. Sadeh, C. Rosé, Graham Neubig
- 2 June 2018
Computer Science
International Conference on Computational…
This work proposes an evaluation methodology consisting of automatically constructed “stress tests” that allow us to examine whether systems have the ability to make real inferential decisions, and reveals strengths and weaknesses of these models with respect to challenging linguistic phenomena.
Controllable Invariance through Adversarial Feature Learning
- Qizhe Xie, Zihang Dai, Yulun Du, E. Hovy, Graham Neubig
- 31 May 2017
Computer Science
NIPS
This paper shows that the proposed framework induces an invariant representation, and leads to better generalization evidenced by the improved performance on three benchmark tasks.
Stack-Pointer Networks for Dependency Parsing
- Xuezhe Ma, Zecong Hu, J. Liu, Nanyun Peng, Graham Neubig, E. Hovy
- 3 May 2018
Computer Science
Annual Meeting of the Association for…
A novel architecture for dependency parsing: stack-pointer networks (StackPtr), which first reads and encodes the whole sentence, then builds the dependency tree top-down in a depth-first fashion, yielding an efficient decoding algorithm with O(n^2) time complexity.
...
...