Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation

@article{Bing2021ComplexRM,
  title={Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation},
  author={Zhenshan Bing and Matthias Brucker and Fabrice O. Morin and Kai Huang and Alois Knoll},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2021},
  volume={PP}
}
Reinforcement learning algorithms, such as hindsight experience replay (HER) and hindsight goal generation (HGG), have been able to solve challenging robotic manipulation tasks in multigoal settings with sparse rewards. HER achieves its training success through hindsight replays of past experience with heuristic goals but underperforms in challenging tasks in which goals are difficult to explore. HGG enhances HER by selecting intermediate goals that are easy to achieve in the short term and…

References

SHOWING 1-10 OF 35 REFERENCES
Exploration via Hindsight Goal Generation
TLDR
HGG is introduced, a novel algorithmic framework that generates valuable hindsight goals that are easy for an agent to achieve in the short term and also have the potential to guide the agent toward the actual goal in the long term.
Reverse Curriculum Generation for Reinforcement Learning
TLDR
This work proposes a method to learn goal-oriented tasks without requiring any prior knowledge other than obtaining a single state in which the task is achieved, and generates a curriculum of start states that adapts to the agent's performance, leading to efficient training on goal-oriented tasks.
Energy-Based Hindsight Experience Prioritization
TLDR
An energy-based framework is proposed for prioritizing hindsight experience in robotic manipulation tasks. Inspired by the work-energy principle in physics, it hypothesizes that replaying episodes with high trajectory energy is more effective for reinforcement learning in robotics.
Curiosity-Driven Experience Prioritization via Density Estimation
TLDR
A novel Curiosity-Driven Prioritization (CDP) framework is proposed that encourages the agent to over-sample trajectories with rare achieved goal states; experimental results show that CDP improves both the performance and the sample efficiency of reinforcement learning agents compared to state-of-the-art methods.
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
TLDR
CURIOUS is proposed, an algorithm that leverages a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress.
Hindsight Experience Replay
TLDR
A novel technique is presented that allows sample-efficient learning from sparse, binary rewards and therefore avoids the need for complicated reward engineering; it may be seen as a form of implicit curriculum.
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
TLDR
A novel multi-goal RL objective based on weighted entropy is proposed, which encourages the agent to maximize the expected return as well as to achieve more diverse goals, and a maximum entropy-based prioritization framework is developed to optimize the proposed objective.
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
TLDR
A suite of challenging continuous control tasks (integrated with OpenAI Gym), based on currently existing robotics hardware and following a Multi-Goal Reinforcement Learning (RL) framework, is introduced.
Curriculum-guided Hindsight Experience Replay
TLDR
This paper proposes to adaptively select the failed experiences for replay according to the proximity to the true goals and the curiosity of exploration over diverse pseudo goals, and adopts a human-like learning strategy that enforces more curiosity in earlier stages and changes to larger goal-proximity later.
Search on the Replay Buffer: Bridging Planning and Reinforcement Learning
TLDR
The algorithm, search on the replay buffer (SoRB), enables agents to solve sparse reward tasks over one hundred steps, and generalizes substantially better than standard RL algorithms.