Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

- Chelsea Finn, P. Abbeel, S. Levine
- Computer Science, Mathematics
- ICML
- 9 March 2017

We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…

Trust Region Policy Optimization

- John Schulman, S. Levine, P. Abbeel, Michael I. Jordan, P. Moritz
- Computer Science, Mathematics
- ICML
- 19 February 2015

In this article, we describe a method for optimizing control policies, with guaranteed monotonic improvement. By making several approximations to the theoretically-justified scheme, we develop a…

Apprenticeship learning via inverse reinforcement learning

We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to…

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets

- Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, P. Abbeel
- Computer Science, Mathematics
- NIPS
- 12 June 2016

This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner. InfoGAN…

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

- T. Haarnoja, Aurick Zhou, P. Abbeel, S. Levine
- Computer Science, Mathematics
- ICML
- 4 January 2018

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major…

High-Dimensional Continuous Control Using Generalized Advantage Estimation

- John Schulman, P. Moritz, S. Levine, Michael I. Jordan, P. Abbeel
- Computer Science, Mathematics
- ICLR
- 8 June 2015

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function…

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

- Ryan Lowe, Yi Wu, A. Tamar, J. Harb, P. Abbeel, Igor Mordatch
- Computer Science, Mathematics
- NIPS
- 7 June 2017

We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent…

Hindsight Experience Replay

- Marcin Andrychowicz, Dwight Crow, +7 authors W. Zaremba
- Computer Science, Mathematics
- NIPS
- 5 July 2017

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning…

Benchmarking Deep Reinforcement Learning for Continuous Control

- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, P. Abbeel
- Computer Science, Mathematics
- ICML
- 22 April 2016

Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training…

Discriminative Probabilistic Models for Relational Data

In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of…