Reverse Curriculum Generation for Reinforcement Learning
@article{Florensa2017ReverseCG,
  title   = {Reverse Curriculum Generation for Reinforcement Learning},
  author  = {Carlos Florensa and David Held and Markus Wulfmeier and Michael Zhang and P. Abbeel},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1707.05300}
}
Many relevant tasks require an agent to reach a certain state, or to manipulate objects into a desired configuration. We propose a method to learn these tasks without requiring any prior knowledge other than a single state in which the task is achieved. The robot is trained in reverse, gradually learning to reach the goal from a set of start states increasingly far from the goal.
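The reverse-training idea above can be sketched in a few lines. This is a minimal illustration on a hypothetical 1-D chain environment (the chain, step counts, and sample sizes are my assumptions for the sketch, not the paper's actual robotics setup): start states are generated by short random walks outward from the goal, so each curriculum stage begins farther away.

```python
import random

# Toy sketch of reverse curriculum generation (hypothetical 1-D chain,
# not the paper's environments). The agent must reach GOAL; start
# states are produced by random walks from the goal, so early stages
# start near it and later stages start progressively farther away.

GOAL = 0
CHAIN_LEN = 20

def brownian_starts(seeds, num_steps, n_samples):
    """Expand the start-state set by short random walks from current seeds."""
    starts = set()
    for _ in range(n_samples):
        s = random.choice(seeds)
        for _ in range(num_steps):
            s = max(0, min(CHAIN_LEN, s + random.choice([-1, 1])))
        starts.add(s)
    return sorted(starts)

# Curriculum: each stage's start states drift farther from the goal.
seeds = [GOAL]
for stage in range(3):
    seeds = brownian_starts(seeds, num_steps=2, n_samples=50)
    # ... train the policy from these start states here ...
```

In the paper the expansion is additionally filtered by success rate (keeping start states of intermediate difficulty); that filtering is omitted here for brevity.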
262 Citations
Region Growing Curriculum Generation for Reinforcement Learning
- Computer Science · ArXiv
- 2018
Describes a region-growing method that allows learning in an environment with any pair of initial and goal states, together with a scheme for adaptively adjusting the expansion of the growing region, which automatically tunes the key exploration hyperparameter to environments with different requirements.
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
- Computer Science · ArXiv
- 2021
Presents an algorithm that solves distant goal-reaching tasks by using search at training time to automatically generate a curriculum of intermediate states, and solves very long-horizon manipulation and navigation tasks that prior goal-conditioned methods and methods based on graph search fail to solve.
Exploration via Hindsight Goal Generation
- Computer Science · NeurIPS
- 2019
Introduces HGG, a novel algorithmic framework that generates valuable hindsight goals which are easy for an agent to achieve in the short term and which also have the potential to guide the agent toward the actual goal in the long term.
Automatic Goal Generation for Reinforcement Learning Agents
- Computer Science · ICML
- 2018
This work uses a generator network to propose tasks for the agent to try to achieve, specified as goal states, and shows that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment.
Goal-conditioned Imitation Learning
- Computer Science · NeurIPS
- 2019
Proposes goalGAIL, a novel algorithm that incorporates demonstrations to drastically speed up convergence to a policy able to reach any goal, surpassing the performance of an agent trained with other Imitation Learning algorithms.
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals
- Computer Science · ArXiv
- 2020
The proposed algorithm, Follow the Object (FO), is evaluated on 7 MuJoCo environments requiring an increasing degree of exploration, and achieves higher success rates than alternative algorithms.
TendencyRL: Multi-stage Discriminative Hints for Efficient Goal-Oriented Reverse Curriculum Learning
- Computer Science · 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2019
Extensively studies the advantages of TRL on standard long-term, goal-oriented robotics domains such as pick-and-place, showing that TRL performs more efficiently and robustly than prior approaches in tasks with large state spaces and can solve difficult robot manipulation challenges directly from perception.
Overcoming Exploration in Reinforcement Learning with Demonstrations
- Computer Science · 2018 IEEE International Conference on Robotics and Automation (ICRA)
- 2018
This work uses demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm.
Automatic Curriculum Learning through Value Disagreement
- Computer Science · NeurIPS
- 2020
This work introduces a goal proposal module that prioritizes goals that maximize the epistemic uncertainty of the Q-function of the policy, and samples goals that are neither too hard nor too easy for the agent to solve, hence enabling continual improvement.
References
Showing 1-10 of 43 references
Automatic Goal Generation for Reinforcement Learning Agents
- Computer Science · ICML
- 2018
This work uses a generator network to propose tasks for the agent to try to achieve, specified as goal states, and shows that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment.
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
- Computer Science · ArXiv
- 2017
The results show that by making extensive use of off-policy data and replay, it is possible to find control policies that robustly grasp objects and stack them and hint that it may soon be feasible to train successful stacking policies by collecting interactions on real robots.
End-to-End Training of Deep Visuomotor Policies
- Computer Science · J. Mach. Learn. Res.
- 2016
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Active learning of inverse models with intrinsically motivated goal exploration in robots
- Computer Science · Robotics Auton. Syst.
- 2013
Purposive behavior acquisition for a real robot by vision-based reinforcement learning
- Computer Science · Machine Learning
- 2004
A method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal, using Learning from Easy Missions (LEM), which reduces the learning time from exponential to almost linear order in the size of the state space.
Reinforcement learning of motor skills with policy gradients
- Computer Science · Neural Networks
- 2008
Exploration from Demonstration for Interactive Reinforcement Learning
- Computer Science · AAMAS
- 2016
This work presents a model-free policy-based approach called Exploration from Demonstration (EfD) that uses human demonstrations to guide search space exploration and shows how EfD scales to large problems and provides convergence speed-ups over traditional exploration and interactive learning methods.
Continuous control with deep reinforcement learning
- Computer Science · ICLR
- 2016
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Benchmarking Deep Reinforcement Learning for Continuous Control
- Computer Science · ICML
- 2016
This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure.
A Survey on Policy Search for Robotics
- Computer Science · Found. Trends Robotics
- 2013
This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and presents a unified view on existing algorithms.