Generative Adversarial Imitation Learning
TLDR
A new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning, is proposed and a certain instantiation of this framework draws an analogy between imitation learning and generative adversarial networks.
Score-Based Generative Modeling through Stochastic Differential Equations
TLDR
This work presents a stochastic differential equation (SDE) that smoothly transforms a complex data distribution to a known prior distribution by slowly injecting noise, and a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise.
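The forward (noise-injecting) half of this idea can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a one-dimensional variance-preserving SDE, dx = -0.5 * beta * x dt + sqrt(beta) dW, with a constant beta chosen for illustration, and integrates it with the Euler-Maruyama method. The function name forward_vp_sde and all parameter values are hypothetical.

```python
import math
import random


def forward_vp_sde(x0, beta=1.0, t_end=10.0, n_steps=200, rng=None):
    """Euler-Maruyama simulation of dx = -0.5*beta*x dt + sqrt(beta) dW.

    Assumption: constant beta and a scalar state, chosen only for illustration.
    """
    if rng is None:
        rng = random.Random(0)
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        drift = -0.5 * beta * x * dt
        diffusion = math.sqrt(beta * dt) * rng.gauss(0.0, 1.0)
        x = x + drift + diffusion
    return x


# Start every sample far from the prior (x0 = 5.0) and inject noise slowly.
rng = random.Random(0)
samples = [forward_vp_sde(5.0, rng=rng) for _ in range(2000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")  # should be near a standard normal
```

Under these assumptions the samples forget their starting point and end up approximately standard normal, which plays the role of the known prior. The reverse-time SDE additionally needs the score of the noised distribution at each time, which this sketch does not estimate.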
Generative Modeling by Estimating Gradients of the Data Distribution
TLDR
A new generative model where samples are produced via Langevin dynamics using gradients of the data distribution estimated with score matching, which allows flexible model architectures, requires no sampling during training or the use of adversarial methods, and provides a learning objective that can be used for principled model comparisons.
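A minimal sketch of Langevin dynamics sampling may help make the summary concrete. It assumes the score (the gradient of the log density) is known in closed form for a one-dimensional Gaussian, as a stand-in for a score estimated with score matching. The names gaussian_score and langevin_sample, the step size, and the target parameters are illustrative assumptions, not values from the paper.

```python
import math
import random

MU, SIGMA = 2.0, 0.5  # assumed target distribution N(MU, SIGMA^2)


def gaussian_score(x):
    """Analytic score d/dx log p(x) of N(MU, SIGMA^2); stands in for a learned score."""
    return -(x - MU) / (SIGMA ** 2)


def langevin_sample(score, x0=0.0, step=0.01, n_steps=500, rng=None):
    """Unadjusted Langevin dynamics: x <- x + (step/2)*score(x) + sqrt(step)*z."""
    if rng is None:
        rng = random.Random(0)
    x = x0
    for _ in range(n_steps):
        x = x + 0.5 * step * score(x) + math.sqrt(step) * rng.gauss(0.0, 1.0)
    return x


rng = random.Random(0)
draws = [langevin_sample(gaussian_score, rng=rng) for _ in range(1000)]
draw_mean = sum(draws) / len(draws)
draw_var = sum((d - draw_mean) ** 2 for d in draws) / len(draws)
print(f"mean={draw_mean:.3f} var={draw_var:.3f}")  # close to MU and SIGMA**2
```

Only the score function is needed to draw samples, so swapping gaussian_score for a trained estimator would leave the sampling loop unchanged.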
A DIRT-T Approach to Unsupervised Domain Adaptation
TLDR
Two novel and related models are proposed: the Virtual Adversarial Domain Adaptation (VADA) model, which combines domain adversarial training with a penalty term that punishes violations of the cluster assumption, and the Decision-boundary Iterative Refinement Training with a Teacher (DIRT-T) model, which takes the VADA model as initialization and employs natural gradient steps to further minimize the cluster assumption violation.
Combining satellite imagery and machine learning to predict poverty
TLDR
This work shows how a convolutional neural network can be trained to identify image features that can explain up to 75% of the variation in local-level economic outcomes, and could transform efforts to track and target poverty in developing countries.
PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples
Adversarial perturbations of normal images are usually imperceptible to humans, but they can seriously confuse state-of-the-art machine learning models. What makes them so special in the eyes of
InfoVAE: Information Maximizing Variational Autoencoders
TLDR
It is shown that this model can significantly improve the quality of the variational posterior and can make effective use of the latent features regardless of the flexibility of the decoding distribution, and it is demonstrated that the models outperform competing approaches on multiple performance metrics.
MOPO: Model-based Offline Policy Optimization
TLDR
A new model-based offline RL algorithm is proposed that applies the variance of a Lipschitz-regularized model as a penalty to the reward function, and it is found that this algorithm outperforms both standard model-based RL methods and existing state-of-the-art model-free offline RL approaches on existing offline RL benchmarks, as well as two challenging continuous control tasks.
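The reward penalty described above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the uncertainty term here is the standard deviation of next-state predictions across a hypothetical ensemble of dynamics models, and the function name penalized_reward and the coefficient lam are made up for the example.

```python
import statistics


def penalized_reward(reward, ensemble_predictions, lam=1.0):
    """Return reward minus lam times an uncertainty estimate.

    Assumption: uncertainty is the population standard deviation of scalar
    predictions from several dynamics models (ensemble disagreement).
    """
    uncertainty = statistics.pstdev(ensemble_predictions)
    return reward - lam * uncertainty


# Models agree: no penalty is applied.
agree = penalized_reward(1.0, [0.0, 0.0, 0.0])
# Models disagree: the reward is reduced in proportion to the disagreement.
disagree = penalized_reward(1.0, [1.0, 3.0], lam=0.5)
print(agree, disagree)
```

A policy trained on the penalized reward is discouraged from visiting state-action pairs where the learned model is unreliable, which is the intent of the penalty.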
Accurate Uncertainties for Deep Learning Using Calibrated Regression
TLDR
This work proposes a simple procedure for calibrating any regression algorithm, and finds that it consistently outputs well-calibrated credible intervals while improving performance on time series forecasting and model-based reinforcement learning tasks.
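One way to picture a calibration procedure of this kind is as a monotone map applied to the model's predicted CDF values. The sketch below is a simplification under stated assumptions: it uses the empirical distribution of predicted CDF values on held-out data as the recalibration map, and it assumes a deliberately overconfident Gaussian forecaster. The names normal_cdf and fit_recalibrator and all numeric settings are hypothetical.

```python
import bisect
import math
import random


def normal_cdf(y, mean, std):
    """CDF of a normal distribution evaluated at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


def fit_recalibrator(predicted_cdf_values):
    """Return R(p): the fraction of held-out predicted CDF values that are <= p."""
    sorted_values = sorted(predicted_cdf_values)
    n = len(sorted_values)

    def recalibrate(p):
        return bisect.bisect_right(sorted_values, p) / n

    return recalibrate


rng = random.Random(0)
# Assumed setup: true data is N(0, 1), but the model predicts N(0, 0.5^2).
calib_y = [rng.gauss(0.0, 1.0) for _ in range(2000)]
test_y = [rng.gauss(0.0, 1.0) for _ in range(2000)]
recalibrate = fit_recalibrator([normal_cdf(y, 0.0, 0.5) for y in calib_y])

level = 0.9
raw = [normal_cdf(y, 0.0, 0.5) for y in test_y]
raw_coverage = sum(p <= level for p in raw) / len(raw)
recal_coverage = sum(recalibrate(p) <= level for p in raw) / len(raw)
print(f"raw={raw_coverage:.3f} recalibrated={recal_coverage:.3f}")
```

In this setup the raw 90% quantile covers noticeably fewer than 90% of the test points, while the recalibrated one lands close to the nominal level.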
InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations
TLDR
A new algorithm is proposed that can infer the latent structure of expert demonstrations in an unsupervised way, built on top of Generative Adversarial Imitation Learning, and can not only imitate complex behaviors, but also learn interpretable and meaningful representations of complex behavioral data, including visual demonstrations.