Goal-Aware Prediction: Learning to Model What Matters
@article{Nair2020GoalAwarePL,
  title   = {Goal-Aware Prediction: Learning to Model What Matters},
  author  = {Suraj Nair and Silvio Savarese and Chelsea Finn},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2007.07170}
}
Learned dynamics models combined with both planning and policy learning algorithms have shown promise in enabling artificial agents to learn to perform many diverse tasks with limited supervision. However, one of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model (future state reconstruction), and that of the downstream planner or policy (completing a specified task). This issue is exacerbated by vision-based control…
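To make the objective mismatch concrete, here is a minimal sketch contrasting standard full-state reconstruction with a goal-aware variant that reconstructs only the state-goal residual, in the spirit of the paper; the architecture, layer sizes, and names below are illustrative assumptions, not the authors' exact model.

```python
import torch
import torch.nn as nn

class GoalAwareModel(nn.Module):
    """Illustrative goal-conditioned dynamics model (all sizes are assumptions)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.dynamics = nn.Sequential(
            nn.Linear(hidden + action_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.decoder = nn.Linear(hidden, state_dim)  # decodes s' - g, not s'

    def loss(self, s, a, s_next, g):
        z = self.encoder(torch.cat([s, g], dim=-1))       # goal-conditioned encoding
        z_next = self.dynamics(torch.cat([z, a], dim=-1))
        # Reconstruct the goal residual: dimensions that already match the
        # goal contribute ~0 error, focusing capacity on what matters.
        return ((self.decoder(z_next) - (s_next - g)) ** 2).mean()
```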
25 Citations
Value Gradient weighted Model-Based Reinforcement Learning
- Computer Science, ArXiv
- 2022
The Value-Gradient weighted Model loss (VaGraM) is proposed, a novel method for value-aware model learning which improves the performance of MBRL in challenging settings, such as small model capacity and the presence of distracting state dimensions.
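As a rough illustration of the value-aware idea, a VaGraM-style loss can be sketched as squared model error projected onto the value function's gradient; the sketch below assumes a differentiable value approximation and simplifies the published objective.

```python
import torch

def vagram_style_loss(model, value_fn, s, a, s_next):
    # Gradient of the (approximate) value function at the observed next state.
    x = s_next.detach().requires_grad_(True)
    grad_v = torch.autograd.grad(value_fn(x).sum(), x)[0].detach()
    # Squared model error measured along the value gradient: errors in state
    # dimensions the value function ignores are cheap; errors in
    # value-relevant dimensions are expensive.
    err = model(s, a) - s_next
    return ((grad_v * err).sum(dim=-1) ** 2).mean()
```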
Control-Aware Prediction Objectives for Autonomous Driving
- Computer Science, ArXiv
- 2022
This paper presents control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner to be differentiable, and proposes two types of importance weights that weight the predictive likelihood.
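A minimal sketch of an objective of this flavor: reweight the per-example predictive negative log-likelihood by importance weights that are large where a prediction error would change the planner's decision. The interface below is an assumption, not the paper's API.

```python
import torch

def control_aware_nll(pred_dist, target, weights):
    """Importance-weighted predictive likelihood (interface is an assumption).

    pred_dist: a torch.distributions object over future states/trajectories
    weights:   per-example importance weights, large where a prediction error
               would change the planner's chosen action
    """
    nll = -pred_dist.log_prob(target)   # per-example negative log-likelihood
    if nll.dim() > 1:                   # factorized dists return per-dim terms
        nll = nll.sum(dim=-1)
    return (weights * nll).mean()
```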
Goal-Conditioned Reinforcement Learning: Problems and Solutions
- Computer Science, ArXiv
- 2022
An overview of the challenges and algorithms for goal-conditioned reinforcement learning is provided, presenting how goals are represented and how existing solutions are designed from different points of view.
Know Thyself: Transferable Visual Control Policies Through Robot-Awareness
- Computer Science
- 2022
Training visual control policies from scratch on a new robot typically requires generating large amounts of robot-specific data. How might we leverage data previously collected on another robot to…
R3M: A Universal Visual Representation for Robot Manipulation
- Computer Science, ArXiv
- 2022
This work pre-trains a visual representation on the Ego4D human video dataset using a combination of time-contrastive learning, video-language alignment, and an L1 penalty to encourage sparse and compact representations, resulting in R3M.
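A hedged sketch of how such a combined objective might look; the exact loss forms, margin, and weights below are assumptions rather than R3M's published definitions.

```python
import torch
import torch.nn.functional as F

def r3m_style_loss(z_t, z_tk, z_neg, z_lang, lam=(1.0, 1.0, 1e-5)):
    # Time-contrastive term: frames close in time embed close together,
    # a frame from elsewhere (z_neg) is pushed away (margin form assumed).
    pos = -F.cosine_similarity(z_t, z_tk, dim=-1)
    neg = -F.cosine_similarity(z_t, z_neg, dim=-1)
    tcn = F.relu(pos - neg + 0.5).mean()
    # Video-language alignment: the video embedding should match an encoding
    # of its language annotation.
    align = (1 - F.cosine_similarity(z_tk, z_lang, dim=-1)).mean()
    # L1 penalty encourages sparse, compact representations.
    sparsity = z_t.abs().mean()
    return lam[0] * tcn + lam[1] * align + lam[2] * sparsity
```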
Time-optimized velocity trajectory of a bounded-input double integrator with uncertainties: a solution based on PILCO
- Computer Science
- 2022
A simulation and experiment applying an existing model-based RL framework, PILCO, to the problem of state-to-state time-optimal control with bounded input in the presence of uncertainties; a Gaussian process is employed to model the dynamics, successfully reducing the effect of model bias.
VisuoSpatial Foresight for physical sequential fabric manipulation
- Computer Science, Auton. Robots
- 2022
Results suggest that training visual dynamics models using longer, corner-based actions can improve the efficiency of fabric folding by 76% and enable a physical sequential fabric folding task that VSF could not previously perform with 90% reliability.
C-Learning: Learning to Achieve Goals via Recursive Classification
- Computer Science, ICLR
- 2021
This work lays a principled foundation for goal-conditioned RL as density estimation, providing justification for goal relabeling methods used in prior work; an off-policy variant of the algorithm allows predicting the future state distribution of a new policy without collecting new experience.
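The classification view can be sketched as a binary classifier that distinguishes states drawn from the policy's future from states drawn from a marginal; the recursive/bootstrapped targets that give C-Learning its name are omitted here, and all names and sizes are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACT_DIM = 8, 2   # placeholder sizes
classifier = nn.Sequential(nn.Linear(2 * STATE_DIM + ACT_DIM, 256),
                           nn.ReLU(), nn.Linear(256, 1))

def c_learning_step(s, a, s_future, s_random):
    # Label 1: candidate goal drawn from the policy's discounted future.
    # Label 0: candidate goal drawn from the marginal state distribution.
    pos = classifier(torch.cat([s, a, s_future], dim=-1))
    neg = classifier(torch.cat([s, a, s_random], dim=-1))
    return (F.binary_cross_entropy_with_logits(pos, torch.ones_like(pos)) +
            F.binary_cross_entropy_with_logits(neg, torch.zeros_like(neg)))
```

The trained classifier's odds p/(1-p) then estimate the density ratio underlying the future state distribution, which is the density-estimation view.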
Comparing Reconstruction- and Contrastive-based Models for Visual Task Planning
- Computer Science, ArXiv
- 2021
This work defines relevant evaluation metrics and performs a thorough study of different loss functions for state representation learning, showing that models exploiting task priors, such as Siamese networks with a simple contrastive loss, outperform reconstruction-based representations in visual task planning.
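For reference, a Siamese encoder with the classic margin-based contrastive loss (the kind of simple task-prior model compared in the paper) might look like the following sketch; the margin and pairing scheme are assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(encoder, x1, x2, same, margin: float = 1.0):
    # same: 1.0 where (x1, x2) depict the same underlying state, else 0.0
    d = F.pairwise_distance(encoder(x1), encoder(x2))
    # Pull matching pairs together, push non-matching pairs past the margin.
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()
```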
References
Showing 1-10 of 81 references
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
- Computer Science, CoRL
- 2019
An algorithm is developed that learns an action-conditional, predictive model of expected future observations, rewards and values from which a policy can be derived by following the gradient of the estimated value along imagined trajectories.
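The value-gradient idea can be sketched as backpropagating a bootstrapped return through an imagined, differentiable rollout; all components below are assumed stand-ins, not the paper's architecture.

```python
import torch

def imagined_return(model, policy, value_fn, reward_fn, s, horizon=5, gamma=0.99):
    ret, disc = 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)
        s = model(s, a)                # differentiable imagined transition
        ret = ret + disc * reward_fn(s)
        disc *= gamma
    ret = ret + disc * value_fn(s)     # bootstrap with the learned value
    return ret.mean()                  # ascend its gradient w.r.t. the policy
```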
Learning Latent Dynamics for Planning from Pixels
- Computer Science, ICML
- 2019
The Deep Planning Network (PlaNet) is proposed, a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space using a latent dynamics model with both deterministic and stochastic transition components.
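A rough sketch of a PlaNet-style transition cell with both a deterministic (recurrent) path and a stochastic path; PlaNet's full RSSM also learns a posterior from observations, which is omitted here, and all sizes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RSSMCell(nn.Module):
    def __init__(self, stoch=30, deter=200, action_dim=4):
        super().__init__()
        self.rnn = nn.GRUCell(stoch + action_dim, deter)
        self.prior = nn.Linear(deter, 2 * stoch)     # mean and raw std of z'

    def forward(self, z, a, h):
        h = self.rnn(torch.cat([z, a], dim=-1), h)   # deterministic path
        mean, raw_std = self.prior(h).chunk(2, dim=-1)
        z_next = mean + F.softplus(raw_std) * torch.randn_like(mean)  # stochastic path
        return z_next, h
```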
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction
- Computer Science, NeurIPS
- 2019
It is shown that the emerged world model, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment.
SOLAR: Deep Structured Latent Representations for Model-Based Reinforcement Learning
- Computer Science, ArXiv
- 2018
This work focuses on learning representations with probabilistic graphical model (PGM) structure, which enables an efficient local model method that infers dynamics from real-world rollouts, with the PGM as a global prior.
Unsupervised Visuomotor Control through Distributional Planning Networks
- Computer Science, Robotics: Science and Systems
- 2019
This work aims to learn an unsupervised embedding space in which the robot can measure progress towards a goal for itself, enabling effective, control-centric representations that lead to more autonomous reinforcement learning algorithms.
Model-Based Reinforcement Learning for Atari
- Computer Science, ICLR
- 2020
Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, is described and a comparison of several model architectures is presented, including a novel architecture that yields the best results in the authors' setting.
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
- Computer Science, NeurIPS
- 2019
The algorithm, search on the replay buffer (SoRB), enables agents to solve sparse reward tasks over one hundred steps, and generalizes substantially better than standard RL algorithms.
Learning Latent State Spaces for Planning through Reward Prediction
- Computer Science, ArXiv
- 2019
This work introduces a model-based planning framework that learns a latent reward prediction model and then plans in the latent state space, and finds that this method can successfully learn an accurate latent reward model in the presence of irrelevant information where existing model-based methods fail.
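The core training signal can be sketched as a reward-prediction loss through a latent dynamics model, so the representation never needs to reconstruct task-irrelevant pixels; the module names below are placeholders, not the paper's code.

```python
import torch

def latent_reward_loss(encoder, dynamics, reward_head, s, a, r):
    z = encoder(s)
    z_next = dynamics(torch.cat([z, a], dim=-1))
    # The representation is trained through reward prediction rather than
    # reconstruction, so task-irrelevant pixels need not be modeled.
    return ((reward_head(z_next).squeeze(-1) - r) ** 2).mean()
```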
Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
- Computer Science, ICLR
- 2020
A framework for subgoal generation and planning, hierarchical visual foresight (HVF), which generates subgoal images conditioned on a goal image, and uses them for planning, and observes that the method naturally identifies semantically meaningful states as subgoals.
Exploring Model-based Planning with Policy Networks
- Computer Science, ICLR
- 2020
This paper proposes a novel MBRL algorithm, model-based policy planning (POPLIN), that combines policy networks with online planning, and shows that POPLIN obtains state-of-the-art performance in the MuJoCo benchmarking environments, being about 3x more sample efficient than state-of-the-art algorithms such as PETS, TD3, and SAC.
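The combination of a policy network with online planning can be sketched as policy-seeded CEM under the learned model; POPLIN's actual variants plan in action or policy-parameter space, so the following is a simplified illustration with assumed, batch-agnostic model and policy interfaces.

```python
import torch

def plan(model, policy, reward_fn, s0, horizon=10, pop=500, iters=5, elite=50):
    # Seed the plan by rolling the policy through the learned model.
    s, seed = s0, []
    for _ in range(horizon):
        a = policy(s)
        seed.append(a)
        s = model(s, a)
    mu, std = torch.stack(seed), torch.ones(horizon, seed[0].shape[-1])
    # Refine the seed with CEM-style sampling under the model.
    for _ in range(iters):
        acts = mu + std * torch.randn(pop, *mu.shape)        # candidate plans
        s = s0.unsqueeze(0).expand(pop, -1)
        rets = torch.zeros(pop)
        for t in range(horizon):
            s = model(s, acts[:, t])
            rets = rets + reward_fn(s)
        top = rets.topk(elite).indices                       # keep the elites
        mu, std = acts[top].mean(dim=0), acts[top].std(dim=0)
    return mu[0]                                             # first refined action
```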