In Search of Robust Measures of Generalization
- G. Dziugaite, Alexandre Drouin, Daniel M. Roy
- Computer Science, Neural Information Processing Systems
- 22 October 2020
This work addresses the question of how to evaluate generalization bounds empirically and argues that generalization measures should be evaluated within the framework of distributional robustness.
Pretraining Representations for Data-Efficient Reinforcement Learning
- Max Schwarzer, Nitarshan Rajkumar, Aaron C. Courville
- Computer Science, Neural Information Processing Systems
- 9 June 2021
This work uses unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data, and employs a combination of latent dynamics modelling and unsupervised goal-conditioned RL to encourage learning representations which capture diverse aspects of the underlying MDP.
Evaluating the Text-to-SQL Capabilities of Large Language Models
- Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau
- Computer Science, ArXiv
- 15 March 2022
It is demonstrated on the GeoQuery and Scholar benchmarks that a small number of in-domain examples provided in the prompt enables Codex to perform better than state-of-the-art models finetuned on such few-shot examples.
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
- Shoaib Ahmed Siddiqui, Nitarshan Rajkumar, Tegan Maharaj, David Krueger, Sara Hooker
- Computer Science, ArXiv
- 20 September 2022
This work provides a unified and efficient framework for Metadata Archaeology (uncovering and inferring metadata of examples in a dataset), and the framework is on par with far more sophisticated mitigation methods across different tasks.
Myriad: a real-world testbed to bridge trajectory optimization and deep learning
- Nikolaus H. R. Howe, Simon Dufort-Labbé, Nitarshan Rajkumar, Pierre-Luc Bacon
- Computer Science, ArXiv
- 22 February 2022
We present Myriad, a testbed written in JAX which enables machine learning researchers to benchmark imitation learning and reinforcement learning algorithms against trajectory optimization-based…