Publications
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning and the generation of images that humans find visually realistic. It presents ImageNet samples of unprecedented resolution and shows that the proposed methods enable the model to learn recognizable features of ImageNet classes.
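A minimal sketch of the paper's semi-supervised discriminator loss (my own illustration in PyTorch, not the authors' code; tensor names are placeholders): a K-class classifier head doubles as the discriminator, with p(real | x) = sigmoid(logsumexp(logits)).

```python
import torch
import torch.nn.functional as F

def ssl_gan_d_loss(logits_lab, y, logits_unl, logits_fake):
    """Semi-supervised GAN discriminator loss: a K-class head where
    p(real | x) = sigmoid(logsumexp(logits)); 'fake' is the implicit extra class."""
    l_unl = torch.logsumexp(logits_unl, dim=1)
    l_fake = torch.logsumexp(logits_fake, dim=1)
    loss_sup = F.cross_entropy(logits_lab, y)      # labeled data: predict the class
    loss_real = F.softplus(-l_unl).mean()          # unlabeled data: classify as real
    loss_fake = F.softplus(l_fake).mean()          # generated data: classify as fake
    return loss_sup + loss_real + loss_fake
```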
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
TLDR
Experiments show that InfoGAN learns interpretable representations that are competitive with representations learned by existing fully supervised methods.
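The core of InfoGAN is a variational lower bound on the mutual information between a latent code c and the generated sample; for a categorical code it reduces to a cross-entropy. A sketch (G and Q below are hypothetical generator and auxiliary networks, shown in comments):

```python
import torch
import torch.nn.functional as F

def mutual_info_loss(q_logits, c_idx):
    """Variational lower bound on I(c; G(z, c)): for a categorical code this is,
    up to a constant entropy term, the cross-entropy between the sampled code
    and Q's prediction of it from the generated sample."""
    return F.cross_entropy(q_logits, c_idx)

# hypothetical usage inside a training step:
# c_idx = torch.randint(0, n_codes, (batch,))                       # sample latent code
# x_fake = G(torch.cat([z, F.one_hot(c_idx, n_codes).float()], 1))  # condition G on it
# loss_info = mutual_info_loss(Q(x_fake), c_idx)                    # added to G and Q losses
```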
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
TLDR
This work explores the use of Evolution Strategies (ES), a class of black-box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and policy gradients, and highlights several advantages of ES as a black-box optimization technique.
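A minimal NES-style sketch of the estimator (my own toy implementation; hyperparameters are placeholders): perturb the parameters with Gaussian noise, evaluate the objective at each perturbation, and move along the fitness-weighted noise, with no backpropagation through the objective.

```python
import numpy as np

def evolution_strategies(f, theta, sigma=0.1, alpha=0.01, pop=50, iters=200):
    """Estimate the gradient of E[f(theta + sigma * eps)] from sampled
    perturbations and ascend it; f is treated as a black box."""
    for _ in range(iters):
        eps = np.random.randn(pop, theta.size)
        returns = np.array([f(theta + sigma * e) for e in eps])
        adv = (returns - returns.mean()) / (returns.std() + 1e-8)  # fitness shaping
        theta = theta + alpha / (pop * sigma) * eps.T @ adv        # gradient ascent step
    return theta

# usage on a toy objective (maximize -||w||^2, optimum at 0)
theta = evolution_strategies(lambda w: -np.sum(w ** 2), np.random.randn(5))
```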
Benchmarking Deep Reinforcement Learning for Continuous Control
TLDR
This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure.
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
TLDR
This work discusses the implementation of PixelCNNs, a recently proposed class of powerful generative models with tractable likelihood, and describes a number of modifications to the original model that both simplify its structure and improve its performance.
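One of the modifications is replacing the 256-way softmax over pixel values with a discretized logistic likelihood, obtained by integrating a logistic density over each pixel's bin. A sketch for a single component (my own illustration; the paper uses a mixture, and pixels are rescaled to [-1, 1]):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def discretized_logistic_logp(x, mu, log_scale):
    """log P(pixel = x) under a logistic(mu, s) integrated over the pixel's bin;
    x and mu live in [-1, 1] (rescaled from {0..255}), so bins are 2/255 wide."""
    inv_s = np.exp(-log_scale)
    cdf_plus = sigmoid(inv_s * (x + 1.0 / 255 - mu))   # CDF at upper bin edge
    cdf_minus = sigmoid(inv_s * (x - 1.0 / 255 - mu))  # CDF at lower bin edge
    prob = cdf_plus - cdf_minus
    # edge bins absorb all probability mass beyond [-1, 1]
    prob = np.where(x < -0.999, cdf_plus, prob)
    prob = np.where(x > 0.999, 1.0 - cdf_minus, prob)
    return np.log(np.maximum(prob, 1e-12))
```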
Variational Lossy Autoencoder
TLDR
This paper presents a simple but principled method to learn global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE, and PixelRNN/CNN, greatly improving the generative modeling performance of VAEs.
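As a compact reminder of the construction (notation mine): the decoder is autoregressive but deliberately restricted to a small local window, so the latent code is forced to carry the global structure of the input:

$$\log p(x) \;\ge\; \mathbb{E}_{q(z \mid x)}\Big[\textstyle\sum_i \log p\big(x_i \mid x_{\mathrm{local}(i)}, z\big)\Big] \;-\; D_{\mathrm{KL}}\big(q(z \mid x)\,\|\,p(z)\big)$$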
RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning
TLDR
This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data: the fast algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.
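A sketch of the recurrent policy interface (PyTorch, my own illustration, not the paper's code): the RNN consumes the previous action, reward, and termination flag alongside the observation, and its hidden state is carried across episode boundaries within a trial, so adaptation happens inside the recurrence.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RL2Policy(nn.Module):
    """GRU policy whose input at each step is (obs, prev_action, prev_reward, done)."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.n_actions = n_actions
        self.gru = nn.GRUCell(obs_dim + n_actions + 2, hidden)
        self.pi = nn.Linear(hidden, n_actions)

    def forward(self, obs, prev_action, prev_reward, done, h):
        a = F.one_hot(prev_action, self.n_actions).float()   # one-hot previous action
        x = torch.cat([obs, a, prev_reward.unsqueeze(1), done.unsqueeze(1)], dim=1)
        h = self.gru(x, h)   # h is reset between trials, NOT between episodes
        return torch.distributions.Categorical(logits=self.pi(h)), h
```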
A Simple Neural Attentive Meta-Learner
TLDR
This work proposes a class of simple and generic meta-learner architectures that use a novel combination of temporal convolutions and soft attention; the former to aggregate information from past experience and the latter to pinpoint specific pieces of information.
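A sketch of the two building blocks (PyTorch, my own illustration of the general pattern rather than the paper's exact architecture): a dilated causal convolution that aggregates past context, and causally masked soft attention that retrieves specific timesteps.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseBlock(nn.Module):
    """Dilated causal convolution whose gated output is concatenated to the input."""
    def __init__(self, channels, filters, dilation):
        super().__init__()
        self.dilation = dilation
        self.conv_f = nn.Conv1d(channels, filters, kernel_size=2, dilation=dilation)
        self.conv_g = nn.Conv1d(channels, filters, kernel_size=2, dilation=dilation)

    def forward(self, x):                      # x: (batch, channels, time)
        xp = F.pad(x, (self.dilation, 0))      # left-pad only => causal
        h = torch.tanh(self.conv_f(xp)) * torch.sigmoid(self.conv_g(xp))
        return torch.cat([x, h], dim=1)

class AttentionBlock(nn.Module):
    """Causally masked soft attention over all past timesteps."""
    def __init__(self, channels, key_dim, value_dim):
        super().__init__()
        self.q = nn.Linear(channels, key_dim)
        self.k = nn.Linear(channels, key_dim)
        self.v = nn.Linear(channels, value_dim)
        self.scale = key_dim ** 0.5

    def forward(self, x):                      # x: (batch, channels, time)
        h = x.transpose(1, 2)                  # (batch, time, channels)
        logits = self.q(h) @ self.k(h).transpose(1, 2) / self.scale
        mask = torch.triu(torch.ones(h.size(1), h.size(1), dtype=torch.bool), 1)
        att = torch.softmax(logits.masked_fill(mask, float('-inf')), dim=-1)
        return torch.cat([x, (att @ self.v(h)).transpose(1, 2)], dim=1)
```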
VIME: Variational Information Maximizing Exploration
TLDR
VIME is introduced, an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics which efficiently handles continuous state and action spaces and can be applied with several different underlying RL algorithms.
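A toy sketch of the bonus under strong simplifying assumptions (a Bayesian linear dynamics model with an independent Gaussian posterior per weight, standing in for the paper's Bayesian neural network): the intrinsic reward is the KL divergence from the old posterior to the posterior updated on a single transition.

```python
import numpy as np

def kl_gaussian(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) for diagonal Gaussians."""
    return 0.5 * np.sum(np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

class BayesLinearDynamics:
    """Toy model: next-state target ~ w . phi(s, a) + noise, with a mean-field
    (per-weight independent) Gaussian posterior -- a crude approximation."""
    def __init__(self, dim, noise_var=0.1):
        self.mu = np.zeros(dim)
        self.var = np.ones(dim)
        self.noise_var = noise_var

    def update(self, phi, target):
        # per-dimension conjugate update (ignores cross-weight correlations)
        prec = 1.0 / self.var + phi ** 2 / self.noise_var
        new_var = 1.0 / prec
        self.mu = new_var * (self.mu / self.var + phi * target / self.noise_var)
        self.var = new_var

def intrinsic_reward(model, phi, target, eta=0.01):
    """VIME-style bonus: information gain from observing one transition."""
    mu0, var0 = model.mu.copy(), model.var.copy()
    model.update(phi, target)
    return eta * kl_gaussian(model.mu, model.var, mu0, var0)

# hypothetical shaping: r_total = r_env + intrinsic_reward(model, phi, target)
```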
Equivalence Between Policy Gradients and Soft Q-Learning
TLDR
There is a precise equivalence between Q-learning and policy gradient methods in the setting of entropy-regularized reinforcement learning, and it is shown that "soft" Q-learning is exactly equivalent to a policy gradient method.
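As a compact reminder of the equivalence (notation mine, with temperature $\tau$): the entropy-regularized optimal policy and soft value function satisfy

$$\pi(a \mid s) = \exp\big((Q(s,a) - V(s))/\tau\big), \qquad V(s) = \tau \log \sum_{a'} \exp\big(Q(s,a')/\tau\big),$$

so gradient steps on the soft Q-learning objective and on the entropy-regularized policy gradient coincide up to estimator details.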