• Corpus ID: 19181872

Reverse Curriculum Generation for Reinforcement Learning

@article{Florensa2017ReverseCG,
  title={Reverse Curriculum Generation for Reinforcement Learning},
  author={Carlos Florensa and David Held and Markus Wulfmeier and Michael Zhang and Pieter Abbeel},
  journal={ArXiv},
  year={2017},
  volume={abs/1707.05300}
}
Many relevant tasks require an agent to reach a certain state, or to manipulate objects into a desired configuration. [...] Instead, we propose a method to learn these tasks without requiring any prior knowledge other than obtaining a single state in which the task is achieved. The robot is trained in reverse, gradually learning to reach the goal from a set of start states increasingly far from the goal.
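As an illustration of the key method above, a minimal Python sketch of the reverse-curriculum loop follows, assuming a hypothetical environment with a settable state (`env.set_state`, `env.step`) and a user-supplied `train_and_evaluate` routine that returns a success rate per start state; all names are illustrative, not the authors' released code.

import numpy as np

R_MIN, R_MAX = 0.1, 0.9   # keep "good starts": neither already solved nor hopeless
N_NEW, HORIZON = 50, 20   # rollouts and steps used to grow the start-state set

def brownian_starts(env, seeds, n_new, horizon):
    # Propose new candidate starts via short random-action ("Brownian") rollouts
    # from states the agent can already solve, drifting away from the goal.
    candidates = []
    for _ in range(n_new):
        env.set_state(seeds[np.random.randint(len(seeds))])  # assumed setter
        for _ in range(horizon):
            # assumes step returns the (settable) state as its first element
            state, _, _, _ = env.step(env.action_space.sample())
        candidates.append(state)
    return candidates

def reverse_curriculum(env, goal_state, policy, train_and_evaluate, iterations=100):
    starts = [goal_state]  # begin trivially close to the goal
    for _ in range(iterations):
        starts = starts + brownian_starts(env, starts, N_NEW, HORIZON)
        rates = train_and_evaluate(policy, env, starts)  # success rate per start
        kept = [s for s, r in zip(starts, rates) if R_MIN <= r <= R_MAX]
        starts = kept or [goal_state]  # never let the start set go empty
    return policy

The R_MIN/R_MAX filter concentrates training on starts of intermediate difficulty, so the frontier of solvable starts keeps expanding away from the goal.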
Region Growing Curriculum Generation for Reinforcement Learning
TLDR
A region-growing method that allows learning in an environment with any pair of initial and goal states is described, along with an adaptive scheme for expanding the growing region that automatically tunes the key exploration hyperparameter to environments with different requirements.
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
TLDR
An algorithm that solves distant goal-reaching tasks by using search at training time to automatically generate a curriculum of intermediate states; it is able to solve very long-horizon manipulation and navigation tasks that prior goal-conditioned methods and graph-search-based methods fail to solve.
Exploration via Hindsight Goal Generation
TLDR
HGG is introduced, a novel algorithmic framework that generates valuable hindsight goals which are easy for an agent to achieve in the short term and also have the potential to guide the agent toward the actual goal in the long term.
Automatic Goal Generation for Reinforcement Learning Agents
TLDR
This work uses a generator network to propose tasks for the agent to try to achieve, specified as goal states, and shows that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment.
Goal-conditioned Imitation Learning
Designing rewards for Reinforcement Learning (RL) is challenging because they need to convey the desired task, be efficient to optimize, and be easy to compute. The latter is particularly problematic [...]
Follow the Object: Curriculum Learning for Manipulation Tasks with Imagined Goals
TLDR
The proposed algorithm, Follow the Object (FO), has been evaluated on 7 MuJoCo environments requiring an increasing degree of exploration, and achieves higher success rates than alternative algorithms.
TendencyRL: Multi-stage Discriminative Hints for Efficient Goal-Oriented Reverse Curriculum Learning
  • Chen Wang, Junfeng Ding, +4 authors Cewu Lu
  • Computer Science
    2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2019
TLDR
This work extensively studies the advantages of TRL on standard long-term goal-oriented robotics domains such as pick-and-place, and shows that TRL performs more efficiently and robustly than prior approaches in tasks with large state spaces and can solve difficult robot manipulation challenges directly from perception.
Overcoming Exploration in Reinforcement Learning with Demonstrations
TLDR
This work uses demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm.
Goal-conditioned Imitation Learning
TLDR
Different approaches to incorporating demonstrations are investigated, drastically speeding up convergence to a policy able to reach any goal and surpassing the performance of an agent trained with other Imitation Learning algorithms.
Automatic Curriculum Learning through Value Disagreement
TLDR
This work introduces a goal proposal module that prioritizes goals maximizing the epistemic uncertainty of the policy's Q-function, sampling goals that are neither too hard nor too easy for the agent to solve and hence enabling continual improvement.
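As a rough sketch of the value-disagreement idea in the entry above, the snippet below samples goals in proportion to the spread of an ensemble of learned value functions (assumed trained elsewhere); names and signatures are illustrative, not the paper's code.

import numpy as np

def sample_goals_by_disagreement(candidate_goals, value_ensemble, state, n_goals=16):
    # candidate_goals: (N, goal_dim) array; value_ensemble: callables V_i(state, goal).
    values = np.stack([
        np.array([v(state, g) for g in candidate_goals])
        for v in value_ensemble
    ])                                  # shape (ensemble_size, N)
    disagreement = values.std(axis=0)   # epistemic-uncertainty proxy per goal
    total = disagreement.sum()
    probs = (disagreement / total if total > 0
             else np.full(len(candidate_goals), 1.0 / len(candidate_goals)))
    idx = np.random.choice(len(candidate_goals), size=n_goals, p=probs)
    return candidate_goals[idx]

Goals on which the ensemble disagrees most tend to be neither trivially easy nor hopeless, which is how such a module targets intermediate difficulty.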

References

Showing 1-10 of 43 references
Automatic Goal Generation for Reinforcement Learning Agents
TLDR
This work uses a generator network to propose tasks for the agent to try to achieve, specified as goal states, and shows that, by using this framework, an agent can efficiently and automatically learn to perform a wide set of tasks without requiring any prior knowledge of its environment.
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
TLDR
The results show that, by making extensive use of off-policy data and replay, it is possible to find control policies that robustly grasp objects and stack them, and hint that it may soon be feasible to train successful stacking policies by collecting interactions on real robots.
End-to-End Training of Deep Visuomotor Policies
TLDR
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Active learning of inverse models with intrinsically motivated goal exploration in robots
We introduce the Self-Adaptive Goal Generation Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture as an intrinsically motivated goal exploration mechanism which allows active learning of inverse models.
Purposive behavior acquisition for a real robot by vision-based reinforcement learning
TLDR
A method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal, using Learning from Easy Missions (LEM) to reduce the learning time from exponential to almost linear order in the size of the state space.
Reinforcement learning of motor skills with policy gradients
TLDR
This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning.
Exploration from Demonstration for Interactive Reinforcement Learning
TLDR
This work presents a model-free policy-based approach called Exploration from Demonstration (EfD) that uses human demonstrations to guide search space exploration and shows how EfD scales to large problems and provides convergence speed-ups over traditional exploration and interactive learning methods.
Continuous control with deep reinforcement learning
TLDR
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Benchmarking Deep Reinforcement Learning for Continuous Control
TLDR
This work presents a benchmark suite of continuous control tasks, including classic tasks like cart-pole swing-up, tasks with very high state and action dimensionality such as 3D humanoid locomotion, tasks with partial observations, and tasks with hierarchical structure.
A Survey on Policy Search for Robotics
TLDR
This work classifies model-free methods based on their policy evaluation strategy, policy update strategy, and exploration strategy and presents a unified view on existing algorithms.