
- Shixiang Gu, Luca Rigazio
- ArXiv
- 2014

Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, but can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and…
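The abstract refers to small, well-designed input perturbations; a minimal illustration of how one can be crafted, using the standard gradient-sign recipe on a toy logistic classifier (not this paper's own method, and all names below are illustrative), might look like:

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """One gradient-sign step on a logistic classifier (illustrative only).

    Moves the input a distance eps (per coordinate) in the direction that
    increases the cross-entropy loss for the true label y, the classic
    recipe for a small adversarial perturbation.
    """
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted P(y = 1)
    grad_x = (p - y) * w                      # d(cross-entropy) / dx
    return x + eps * np.sign(grad_x)

w = np.array([2.0, -1.0]); b = 0.0
x = np.array([0.5, 0.2])                      # correctly classified: w @ x > 0
x_adv = fgsm_perturb(x, w, b, y=1.0, eps=0.6)
# x_adv differs from x by at most eps per coordinate, yet w @ x_adv < 0,
# i.e. the perturbed input is now classified as the other class.
```

The per-coordinate bound `eps` is what makes the distortion "small": for images, an `eps` below the quantization level of pixel intensities is visually imperceptible.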

- Eric Jang, Shixiang Gu, Ben Poole
- ArXiv
- 2016

Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present an efficient gradient estimator that replaces the non-differentiable sample from a categorical distribution…
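The estimator described here relaxes a hard categorical sample into a point on the probability simplex. A minimal NumPy sketch of such a relaxed sample, built from a temperature-scaled softmax over Gumbel-perturbed logits (the function name and toy logits are illustrative, not from the paper):

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed sample from a categorical distribution.

    Adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled
    softmax. The result lies on the probability simplex and approaches a
    one-hot vector as temperature -> 0, while remaining differentiable
    with respect to the logits.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()                      # numerical stability
    expy = np.exp(y)
    return expy / expy.sum()

rng = np.random.default_rng(0)
logits = np.log(np.array([0.1, 0.6, 0.3]))
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
# sample is a length-3 vector of non-negative entries summing to 1.
```

Because the sample is a smooth function of the logits, gradients can flow through it during training, unlike a hard `argmax` draw.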

Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms, particularly when using high-dimensional function approximators, tends to limit their applicability to physical…
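For context, "model-free" means learning from sampled transitions alone, with no transition model. A tabular Q-learning sketch (illustrative only, not the continuous-control algorithm this abstract concerns):

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """One model-free Q-learning update.

    Uses only the sampled experience (s, a, r, s'); no transition
    probabilities are ever consulted.
    """
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

Q = np.zeros((2, 2))
# Repeatedly observing reward 1.0 for action 0 in state 0 (with the
# successor state 1 terminal, i.e. worth 0) drives Q[0, 0] toward 1.0.
for _ in range(100):
    Q = q_learning_update(Q, s=0, a=0, r=1.0, s_next=1, alpha=0.5, gamma=0.9)
```

The sample-complexity concern in the abstract is that each such update consumes one real interaction with the environment, which is expensive on physical systems.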

- Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih
- ArXiv
- 2015

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm. Stochastic neural networks combine the power of large parametric functions with that of graphical models, which makes it possible to learn very complex distributions. However, as backpropagation is not directly applicable to stochastic…
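When backpropagation cannot pass through a discrete sample, one standard workaround is the score-function (REINFORCE) estimator. A toy sketch for a single Bernoulli unit (illustrative; this is the baseline technique, not the estimator this paper proposes):

```python
import numpy as np

def score_function_grad(theta, f, n_samples, rng):
    """Score-function (REINFORCE) estimate of d/dtheta E_{b~Bern(p)}[f(b)].

    With p = sigmoid(theta), the gradient is rewritten as
    E[f(b) * d log P(b) / dtheta], which needs only samples of b,
    not a differentiable path through b.
    """
    p = 1.0 / (1.0 + np.exp(-theta))
    b = (rng.uniform(size=n_samples) < p).astype(float)
    dlogp = b - p                         # d log Bern(b | sigmoid(theta)) / dtheta
    return np.mean(f(b) * dlogp)

rng = np.random.default_rng(1)
# For f(b) = 3b + 1 and theta = 0 (p = 0.5), the exact gradient is
# 3 * p * (1 - p) = 0.75; the estimate converges to it.
g = score_function_grad(0.0, lambda b: 3.0 * b + 1.0, 200_000, rng)
```

The estimator is unbiased but can have high variance, which is precisely the weakness that motivates better gradient estimators for stochastic networks.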

Sequential Monte Carlo (SMC), or particle filtering, is a popular class of methods for sampling from an intractable target distribution using a sequence of simpler intermediate distributions. Like other importance sampling-based methods, performance is critically dependent on the proposal distribution: a bad proposal can lead to arbitrarily inaccurate…
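A minimal sketch of one SMC step, with multinomial resampling, a proposal move, and reweighting, on a toy Gaussian example (all names and the toy target are illustrative):

```python
import numpy as np

def smc_step(particles, weights, propose, log_weight, rng):
    """One sequential-Monte-Carlo step: resample, propagate, reweight.

    Particles are resampled in proportion to their weights (multinomial
    resampling), moved through the proposal distribution, and then
    reweighted by the incremental importance weight.
    """
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights / weights.sum())
    particles = propose(particles[idx], rng)
    w = np.exp(log_weight(particles))
    return particles, w / w.sum()

rng = np.random.default_rng(2)
parts = rng.normal(size=1000)            # prior N(0, 1)
wts = np.ones(1000)
# Toy step: random-walk proposal, then weight by a unit-variance Gaussian
# likelihood for an observation at 1.0 (posterior mean is 0.5).
propose = lambda x, rng: x + 0.1 * rng.normal(size=x.shape)
log_w = lambda x: -0.5 * (x - 1.0) ** 2
parts, wts = smc_step(parts, wts, propose, log_w, rng)
```

A poorly matched proposal shows up here as weight degeneracy: nearly all of `wts` concentrates on a few particles, which is the failure mode the abstract alludes to.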

Model-free deep reinforcement learning (RL) methods have been successful in a wide variety of simulated domains. However, a major obstacle facing deep RL in the real world is the high sample complexity of such methods. Unbiased batch policy-gradient methods offer stable learning, but at the cost of high variance, which often requires large batches, while…

- Shixiang Gu, Ethan Holly, Timothy P. Lillicrap, Sergey Levine
- ArXiv
- 2016

Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically…

- Shixiang Gu, Ethan Holly, Timothy P. Lillicrap, Sergey Levine
- 2017 IEEE International Conference on Robotics…
- 2017

Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically…

- Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck
- ArXiv
- 2016

Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. This paper examines, both theoretically and empirically, approaches to merging on- and off-policy updates for deep…