Corpus ID: 220514423

Goal-Aware Prediction: Learning to Model What Matters

@article{Nair2020GoalAwarePL,
  title={Goal-Aware Prediction: Learning to Model What Matters},
  author={Suraj Nair and Silvio Savarese and Chelsea Finn},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.07170}
}
Learned dynamics models combined with both planning and policy learning algorithms have shown promise in enabling artificial agents to learn to perform many diverse tasks with limited supervision. However, one of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model (future state reconstruction), and that of the downstream planner or policy (completing a specified task). This issue is exacerbated by vision-based control… 
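To make the mismatch concrete, here is a minimal, illustrative Python sketch (not the paper's GAP architecture) contrasting a standard reconstruction objective, which penalizes error on every state dimension equally, with a hypothetical goal-conditioned variant that up-weights the dimensions in which the true next state still differs from the goal:

```python
import torch

def reconstruction_loss(pred_next_state, next_state):
    # Standard forward-model objective: reconstruct every dimension equally,
    # whether or not it matters for the downstream task.
    return ((pred_next_state - next_state) ** 2).mean()

def goal_weighted_loss(pred_next_state, next_state, goal):
    # Hypothetical goal-aware variant: errors on dimensions where the state
    # is still far from the goal dominate the objective.
    relevance = (next_state - goal).abs()
    weights = relevance / (relevance.sum() + 1e-8)
    return (weights * (pred_next_state - next_state) ** 2).sum()

s_true, s_pred, goal = torch.randn(8), torch.randn(8), torch.randn(8)
print(reconstruction_loss(s_pred, s_true), goal_weighted_loss(s_pred, s_true, goal))
```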
Value Gradient weighted Model-Based Reinforcement Learning
TLDR
The Value-Gradient weighted Model loss (VaGraM) is proposed, a novel method for value-aware model learning which improves the performance of MBRL in challenging settings, such as small model capacity and the presence of distracting state dimensions.
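As a rough sketch of the idea (assuming a differentiable value function; the published method includes further refinements), the per-dimension model error can be rescaled by the value function's gradient at the observed next state, so dimensions the value estimate is sensitive to are penalized more heavily:

```python
import torch

def value_gradient_weighted_loss(pred_next_state, next_state, value_fn):
    # Rescale each dimension's model error by the sensitivity of the value
    # function at the true next state.
    s = next_state.detach().clone().requires_grad_(True)
    (grad_v,) = torch.autograd.grad(value_fn(s), s)
    return ((grad_v * (pred_next_state - next_state)) ** 2).sum()

value_fn = lambda s: -(s ** 2).sum()          # placeholder value function
s_true, s_pred = torch.randn(5), torch.randn(5)
print(value_gradient_weighted_loss(s_pred, s_true, value_fn))
```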
Control-Aware Prediction Objectives for Autonomous Driving
TLDR
This paper presents control-aware prediction objectives (CAPOs) to evaluate the downstream effect of predictions on control without requiring the planner to be differentiable, and proposes two types of importance weights that weight the predictive likelihood.
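A minimal sketch of the reweighting idea (the weight values below are placeholders, not the paper's importance estimates): the per-agent predictive negative log-likelihood is weighted by how much each prediction matters to the downstream controller.

```python
import torch

def weighted_prediction_loss(log_likelihoods, importance_weights):
    # Importance-weighted negative log-likelihood over predicted agents.
    w = importance_weights / importance_weights.sum()
    return -(w * log_likelihoods).sum()

log_liks = torch.tensor([-1.2, -0.3, -2.5, -0.8])     # per-agent log-likelihoods
weights = torch.tensor([0.1, 2.0, 0.5, 0.2])          # e.g., from plan sensitivity
print(weighted_prediction_loss(log_liks, weights))
```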
Goal-Conditioned Reinforcement Learning: Problems and Solutions
TLDR
An overview of the challenges and algorithms for goal-conditioned reinforcement learning is provided, covering how goals are represented and how existing solutions are designed from different points of view.
Know Thyself: Transferable Visual Control Policies Through Robot-Awareness
  • Edward S. Hu, Kun Huang, Oleh Rybkin, Dinesh Jayaraman
  • Computer Science
  • 2022
Training visual control policies from scratch on a new robot typically requires generating large amounts of robot-specific data. How might we leverage data previously collected on another robot to…
R3M: A Universal Visual Representation for Robot Manipulation
TLDR
This work pre-trains a visual representation on the Ego4D human video dataset using a combination of time-contrastive learning, video-language alignment, and an L1 penalty to encourage sparse and compact representations, resulting in R3M.
Time-optimized velocity trajectory of a bounded-input double integrator with uncertainties: a solution based on PILCO
TLDR
A simulation and experiment apply an existing model-based RL framework, PILCO, to the problem of state-to-state time-optimal control with bounded input in the presence of uncertainties; a Gaussian process is employed to model the dynamics, successfully reducing the effect of model bias.
VisuoSpatial Foresight for physical sequential fabric manipulation
TLDR
Results suggest that training visual dynamics models using longer, corner-based actions can improve the efficiency of fabric folding by 76% and enable a physical sequential fabric folding task that VSF could not previously perform with 90% reliability.
C-Learning: Learning to Achieve Goals via Recursive Classification
TLDR
This work lays a principled foundation for goal-conditioned RL as density estimation, providing justification for goal-conditioned methods used in prior work, and an off-policy variant of the algorithm makes it possible to predict the future state distribution of a new policy without collecting new experience.
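A simplified, non-recursive sketch of the classification view (the placeholder classifier and sampling scheme below are illustrative only): a goal-conditioned classifier is trained to tell states the policy actually reaches apart from randomly sampled goals.

```python
import torch
import torch.nn.functional as F

def goal_classifier_loss(classifier, state, action, reached_goal, random_goal):
    # Binary cross-entropy: reached future states are positives, random goals
    # are negatives; the classifier's odds then relate to a future-state density.
    pos = classifier(state, action, reached_goal)
    neg = classifier(state, action, random_goal)
    return (F.binary_cross_entropy_with_logits(pos, torch.ones_like(pos)) +
            F.binary_cross_entropy_with_logits(neg, torch.zeros_like(neg)))

classifier = lambda s, a, g: (s * g).sum(-1) - (a ** 2).sum(-1)   # placeholder
s, a = torch.randn(3, 4), torch.randn(3, 2)
print(goal_classifier_loss(classifier, s, a, torch.randn(3, 4), torch.randn(3, 4)))
```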
Comparing Reconstruction- and Contrastive-based Models for Visual Task Planning
TLDR
This work defines relevant evaluation metrics and performs a thorough study of different loss functions for state representation learning and shows that models exploiting task priors, such as Siamese networks with a simple contrastive loss, outperform reconstruction-based representations in visual task planning.

References

Showing 1-10 of 81 references
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
TLDR
An algorithm is developed that learns an action-conditional, predictive model of expected future observations, rewards and values from which a policy can be derived by following the gradient of the estimated value along imagined trajectories.
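A toy sketch of back-propagating through imagined rollouts (using a summed predicted reward as a stand-in for the learned value estimate; the dynamics and reward below are placeholders):

```python
import torch

policy = torch.nn.Linear(4, 2)
dynamics = lambda s, a: s + 0.1 * torch.cat([a, a], -1)   # placeholder model
reward = lambda s: -s.pow(2).sum()                        # placeholder reward

s = torch.randn(4)
imagined_return = torch.tensor(0.0)
for _ in range(5):                       # differentiable imagined rollout
    a = torch.tanh(policy(s))
    s = dynamics(s, a)
    imagined_return = imagined_return + reward(s)
(-imagined_return).backward()            # ascend the imagined return
print(policy.weight.grad.shape)
```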
Learning Latent Dynamics for Planning from Pixels
TLDR
The Deep Planning Network (PlaNet) is proposed, a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space using a latent dynamics model with both deterministic and stochastic transition components.
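A toy latent transition cell in the same spirit (a deterministic recurrent path plus a stochastic Gaussian path); this is not the published architecture, just an illustration of the two components:

```python
import torch
import torch.nn as nn

class LatentCell(nn.Module):
    def __init__(self, stoch_dim=30, deter_dim=64, action_dim=4):
        super().__init__()
        self.gru = nn.GRUCell(stoch_dim + action_dim, deter_dim)   # deterministic path
        self.to_stats = nn.Linear(deter_dim, 2 * stoch_dim)        # stochastic path

    def forward(self, stoch, action, deter):
        deter = self.gru(torch.cat([stoch, action], -1), deter)
        mean, log_std = self.to_stats(deter).chunk(2, -1)
        stoch = mean + log_std.exp() * torch.randn_like(mean)      # reparameterized sample
        return stoch, deter

cell = LatentCell()
stoch, deter = cell(torch.zeros(1, 30), torch.zeros(1, 4), torch.zeros(1, 64))
print(stoch.shape, deter.shape)
```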
Learning to Predict Without Looking Ahead: World Models Without Forward Prediction
TLDR
It is shown that the emerged world model, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment.
SOLAR: Deep Structured Latent Representations for Model-Based Reinforcement Learning
TLDR
This work focuses on learning representations with probabilistic graphical model (PGM) structure, which enables an efficient local model method that infers dynamics from real-world rollouts with the PGM as a global prior.
Unsupervised Visuomotor Control through Distributional Planning Networks
TLDR
This work aims to learn an unsupervised embedding space under which the robot can measure progress towards a goal for itself, and enables learning effective and control-centric representations that lead to more autonomous reinforcement learning algorithms.
Model-Based Reinforcement Learning for Atari
TLDR
Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, is described and a comparison of several model architectures is presented, including a novel architecture that yields the best results in the authors' setting.
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
TLDR
The algorithm, search on the replay buffer (SoRB), enables agents to solve sparse reward tasks over one hundred steps, and generalizes substantially better than standard RL algorithms.
Learning Latent State Spaces for Planning through Reward Prediction
TLDR
This work introduces a model-based planning framework which learns a latent reward prediction model and then plans in the latent state-space, and finds that this method can successfully learn an accurate latent reward model in the presence of irrelevant information while existing model-based methods fail.
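A minimal random-shooting planner over a latent reward model (both models below are placeholders; a learned latent dynamics and reward predictor would be swapped in):

```python
import numpy as np

def plan_in_latent_space(z0, dynamics, reward, horizon=5, n_candidates=64, action_dim=2):
    # Sample action sequences, roll them through the latent dynamics, and
    # return the first action of the sequence with the highest predicted return.
    actions = np.random.uniform(-1, 1, (n_candidates, horizon, action_dim))
    returns = np.zeros(n_candidates)
    for i in range(n_candidates):
        z = z0
        for t in range(horizon):
            z = dynamics(z, actions[i, t])
            returns[i] += reward(z)
    return actions[returns.argmax(), 0]

dynamics = lambda z, a: 0.9 * z + 0.1 * np.pad(a, (0, z.size - a.size))  # placeholder
reward = lambda z: -np.square(z).sum()                                   # placeholder
print(plan_in_latent_space(np.ones(4), dynamics, reward))
```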
Hierarchical Foresight: Self-Supervised Learning of Long-Horizon Tasks via Visual Subgoal Generation
TLDR
A framework for subgoal generation and planning, hierarchical visual foresight (HVF), which generates subgoal images conditioned on a goal image, and uses them for planning, and observes that the method naturally identifies semantically meaningful states as subgoals.
Exploring Model-based Planning with Policy Networks
TLDR
This paper proposes a novel MBRL algorithm, model-based policy planning (POPLIN), that combines policy networks with online planning and shows that POPLIN obtains state-of-the-art performance in the MuJoCo benchmarking environments, being about 3x more sample efficient than state-of-the-art algorithms such as PETS, TD3 and SAC.