• Publications
  • Influence
Exploration by Random Network Distillation
TLDR
We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed.
  • 335
  • 102
Large-Scale Study of Curiosity-Driven Learning
TLDR
We perform a large-scale study of purely curiosity-driven learning, i.e., without any extrinsic rewards, across 54 standard benchmark environments, including the Atari game suite.
  • 296
  • 56
Data Augmentation Generative Adversarial Networks
TLDR
We show that a Data Augmentation Generative Adversarial Network (DAGAN) augments standard vanilla classifiers well.
  • 420
  • 38
Three Factors Influencing Minima in SGD
TLDR
We study the statistical properties of the endpoint of stochastic gradient descent (SGD) and consider its Boltzmann Gibbs equilibrium distribution under the assumption of isotropic variance.
  • 192
  • 34
How to train your MAML
TLDR
We propose various modifications to MAML that not only stabilize the system, but also substantially improve the generalization performance, convergence speed and computational overhead of the system.
  • 186
  • 33
Advances in Neural Information Processing Systems 16
  • 471
  • 32
Towards a Neural Statistician
TLDR
An efficient learner is one who reuses what they already know to tackle a new problem.
  • 254
  • 31
Censoring Representations with an Adversary
TLDR
Learning flexible representations that minimize the capability of an adversarial critic ensures there is little information in the representation about the sensitive variable.
  • 261
  • 29
CINIC-10 is not ImageNet or CIFAR-10
TLDR
In this brief technical report we introduce the CINIC-10 dataset as a plug-in extended alternative for CIFAR-10.
  • 73
  • 25
Probabilistic inference for solving discrete and continuous state Markov Decision Processes
TLDR
Inference in Markov Decision Processes has recently received interest as a means to infer goals of an observed action, policy recognition, and also as a tool to compute policies.
  • 462
  • 23