Corpus ID: 239009733

Learn Proportional Derivative Controllable Latent Space from Pixels

Weiyao Wang, Marin Kobilarov, Gregory Hager
Recent advances in latent space dynamics models learned from pixels show promising progress in vision-based model predictive control (MPC). However, executing MPC in real time can be challenging because of its intensive computational cost at each timestep. We propose additional learning objectives that enforce that the learned latent space is proportional-derivative (PD) controllable. At execution time, a simple PD controller can be applied directly to the latent space encoded from pixels, to…
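
As a rough illustration of what applying a PD controller to a latent state can look like, here is a minimal sketch. It is not the authors' implementation. It assumes the latent state splits into a position-like part `z` and a velocity-like part `z_dot`, and it uses a toy double integrator in place of a learned latent dynamics model. The names `pd_control` and `simulate`, the gains, and the step size are all hypothetical choices made for this example.

```python
def pd_control(z, z_dot, z_goal, kp, kd):
    """Textbook PD law per latent dimension: u = kp * (z_goal - z) - kd * z_dot.

    Assumption: z and z_dot would come from an encoder in a real system;
    here they are plain lists of floats.
    """
    return [kp * (g - p) - kd * v for p, v, g in zip(z, z_dot, z_goal)]


def simulate(z0, z_goal, kp=4.0, kd=4.0, dt=0.01, steps=2000):
    """Roll out a toy double integrator (z_ddot = u) under the PD law.

    Assumption: the double integrator is a stand-in for learned latent
    dynamics, chosen only because a PD law stabilizes it.
    """
    z = list(z0)
    z_dot = [0.0] * len(z)
    for _ in range(steps):
        u = pd_control(z, z_dot, z_goal, kp, kd)
        # Semi-implicit Euler: update velocity first, then position.
        z_dot = [v + a * dt for v, a in zip(z_dot, u)]
        z = [p + v * dt for p, v in zip(z, z_dot)]
    return z, z_dot


if __name__ == "__main__":
    final_z, final_z_dot = simulate([0.0, 1.0], [1.0, -2.0])
    print(final_z, final_z_dot)
```

Each control step costs only a few multiplications per latent dimension, which is the contrast with per-timestep MPC optimization that the abstract draws.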

Learning Latent Dynamics for Planning from Pixels
The Deep Planning Network (PlaNet) is proposed, a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space using a latent dynamics model with both deterministic and stochastic transition components.
Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
Embed to Control is introduced, a method for model learning and control of non-linear dynamical systems from raw pixel images that is derived directly from an optimal control formulation in latent space and exhibits strong performance on a variety of complex control problems.
Predictive Coding for Locally-Linear Control
This paper proposes a novel information-theoretic learning controllable embedding (LCE) approach, shows theoretically that explicit next-observation prediction can be replaced with predictive coding, and uses predictive coding to develop a decoder-free LCE model whose latent dynamics are amenable to locally-linear control.
Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control
To make PCC tractable, an amortized variational bound for the PCC loss function is derived and it is demonstrated that the new variational-PCC learning algorithm benefits from significantly more stable and reproducible training, and leads to superior control performance.
Dream to Control: Learning Behaviors by Latent Imagination
Dreamer is presented, a reinforcement learning agent that solves long-horizon tasks purely by latent imagination and efficiently learns behaviors by backpropagating analytic gradients of learned state values through trajectories imagined in the compact state space of a learned world model.
PVEs: Position-Velocity Encoders for Unsupervised Learning of Structured State Representations
Position-velocity encoders (PVEs) are proposed, which learn, without supervision, to encode images into positions and velocities of task-relevant objects, computing the velocity state from finite differences in position.
SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
This paper presents a method for learning representations that are suitable for iterative model-based policy improvement, even when the underlying dynamical system has complex dynamics and image observations, in that these representations are optimized for inferring simple dynamics and cost models given data from the current policy.
Deep visual foresight for planning robot motion
This work develops a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data, enables a real robot to perform nonprehensile manipulation (pushing objects), and can handle novel objects not seen during training.
Robust Locally-Linear Controllable Embedding
A new model for learning robust locally-linear controllable embedding (RCE) is presented, which directly estimates the predictive conditional density of the future observation given the current one, while introducing the bottleneck between the current and future observations.
Deep Visual MPC-Policy Learning for Navigation
PoliNet is a deep visual model predictive control-policy learning method that can perform visual navigation while avoiding collisions with unseen objects on the navigation path; it is validated both in a realistic simulation environment and in the real world, outperforming state-of-the-art baselines.