• Corpus ID: 240070909

Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives

  • Murtaza Dalal, Deepak Pathak, Ruslan Salakhutdinov
  • Neural Information Processing Systems
Despite the potential of reinforcement learning (RL) for building general-purpose robotic systems, training RL agents to solve robotics tasks remains challenging because exploration is difficult in purely continuous action spaces. Addressing this problem is an active area of research, with most work focused on improving RL methods via better optimization or more efficient exploration. An alternative but important component to consider improving is the interface of the RL algorithm…

Robot Learning of Mobile Manipulation With Reachability Behavior Priors

This work considers the problem of optimal base placement and the subsequent decision of whether to activate the arm for reaching a 6D target, and devises a novel Hybrid RL (HyRL) method that handles discrete and continuous actions jointly by resorting to the Gumbel-Softmax reparameterization.
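As a rough illustration of the Gumbel-Softmax reparameterization mentioned in this summary, the sketch below draws a relaxed one-hot sample over a discrete choice. It is not the HyRL implementation: the function name `gumbel_softmax`, the temperature value, and the two example logits are assumptions made for illustration only.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over len(logits) categories."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        noisy.append((logit + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for "keep the arm idle" vs "activate the arm".
arm_logits = [0.2, 1.5]
relaxed_choice = gumbel_softmax(arm_logits, temperature=0.5, rng=random.Random(0))
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures make it closer to uniform; in a differentiable framework the same computation lets gradients flow through the discrete decision.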

Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks

This work presents a new method that leverages demonstrations and episodes collected online in any sparse-reward environment with any off-policy algorithm. It is based on a reward bonus given to demonstrations and successful episodes, encouraging expert imitation and self-imitation.
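The reward-bonus idea in this summary can be sketched as a relabelling step applied before transitions enter a replay buffer. This is only a minimal sketch under assumptions: the `Transition` record, the function `relabel_episode`, the constant `bonus`, and the choice to add the bonus to every transition of a qualifying episode are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Transition:
    state: Tuple[float, ...]
    action: Tuple[float, ...]
    reward: float
    next_state: Tuple[float, ...]
    done: bool


def relabel_episode(
    episode: List[Transition],
    is_demo: bool,
    succeeded: bool,
    bonus: float = 1.0,
) -> List[Transition]:
    """Add a reward bonus to demonstrations and successful episodes."""
    if not (is_demo or succeeded):
        # Failed online episodes keep their original sparse rewards.
        return list(episode)
    # Assumption: the bonus is added uniformly to each transition.
    return [replace(t, reward=t.reward + bonus) for t in episode]
```

Because the relabelling only changes stored rewards, it can sit in front of any off-policy learner that samples from the buffer.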

Exploring with Sticky Mittens: Reinforcement Learning with Expert Interventions via Option Templates

This work proposes a framework in which expert intervention lets the agent execute option templates before learning an implementation, and evaluates the approach on three challenging reinforcement learning problems, showing that it outperforms state-of-the-art approaches by two orders of magnitude.

TAPS: Task-Agnostic Policy Sequencing

This work presents Task-Agnostic Policy Sequencing (TAPS), a scalable framework for training manipulation primitives and coordinating their geometric dependencies at plan-time to solve long-horizon tasks never seen by any primitive during training.

Accelerating Reinforcement Learning for Autonomous Driving using Task-Agnostic and Ego-Centric Motion Skills

Validations in various challenging driving scenarios demonstrate that the proposed method, TaEc-RL, significantly outperforms its counterparts in learning efficiency and task performance.

Hierarchical Primitive Composition: Simultaneous Activation of Skills with Inconsistent Action Dimensions in Multiple Hierarchies

This study sought to devise an algorithm that can properly orchestrate skills with different action spaces via multiplicative Gaussian distributions, which greatly increases their reusability.
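One way to picture composing skills through multiplicative Gaussian distributions is a weighted product of per-dimension Gaussians, where a skill contributes only to the action dimensions it defines. The sketch below assumes diagonal Gaussians, strictly positive weights, and a dictionary-based representation; the function `compose_gaussians` and the dimension names are hypothetical and do not reproduce the paper's algorithm.

```python
def compose_gaussians(skills, weights):
    """Weighted product of diagonal Gaussians over possibly different dimensions.

    skills:  list of dicts mapping an action-dimension name to (mean, std)
    weights: one positive activation weight per skill
    Returns a dict mapping each dimension to the composed (mean, std).
    """
    precision = {}
    precision_weighted_mean = {}
    for skill, weight in zip(skills, weights):
        for dim, (mean, std) in skill.items():
            # Raising N(mean, std^2) to the power `weight` scales its precision.
            p = weight / (std * std)
            precision[dim] = precision.get(dim, 0.0) + p
            precision_weighted_mean[dim] = (
                precision_weighted_mean.get(dim, 0.0) + p * mean
            )
    return {
        dim: (precision_weighted_mean[dim] / precision[dim], precision[dim] ** -0.5)
        for dim in precision
    }


# Two hypothetical skills: one controls "x" and "y", the other only "x".
composed = compose_gaussians(
    [{"x": (0.0, 1.0), "y": (2.0, 1.0)}, {"x": (2.0, 1.0)}],
    [1.0, 1.0],
)
```

Dimensions covered by a single skill pass through unchanged, while shared dimensions are pulled toward the more confident (lower-variance) or more heavily weighted skill.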

Active Task Randomization: Learning Visuomotor Skills for Sequential Manipulation by Proposing Feasible and Novel Tasks

This work introduces Active Task Randomization (ATR), an approach that learns visuomotor skills for sequential manipulation by automatically creating feasible and novel tasks in simulation, using a relational neural network that maps each task parameter into a compact embedding.

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

Analysis and ablations reveal that both continuous and discrete components are beneficial, and that the learned hierarchical skills are most useful in sparse-reward settings, as they encourage directed exploration of task-relevant parts of the state space.

Skill-based Model-based Reinforcement Learning

A Skill-based Model-based RL framework (SkiMo) is proposed that enables planning in the skill space using a skill dynamics model, which directly predicts the skill outcomes rather than predicting all the small details of the intermediate states step by step.

PLATO: Predicting Latent Affordances Through Object-Centric Play

Predicting Latent Affordances Through Object-Centric Play (PLATO) outperforms existing methods on complex manipulation tasks in 2D and 3D object manipulation environments, both in simulation and in the real world, for diverse types of interactions.



Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning

An open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks is proposed to make it possible to develop algorithms that generalize to accelerate the acquisition of entirely new, held-out tasks.

PLAS: Latent Action Space for Offline Reinforcement Learning

This work proposes to simply learn the Policy in the Latent Action Space (PLAS) such that this requirement is naturally satisfied, and demonstrates that this method provides competitive performance consistently across various continuous control tasks and different types of datasets, outperforming existing offline reinforcement learning methods with explicit constraints.

Efficient Bimanual Manipulation Using Learned Task Schemas

It is shown that explicitly modeling the schema’s state-independence can yield significant improvements in sample efficiency for model-free reinforcement learning algorithms and can be transferred to solve related tasks, by simply re-learning the parameterizations with which the skills are invoked.

Planning to Explore via Self-Supervised World Models

Without any training supervision or task-specific interaction, Plan2Explore outperforms prior self-supervised exploration methods and, in fact, almost matches the performance of an oracle that has access to rewards.

Data-Efficient Hierarchical Reinforcement Learning

This paper studies how to develop HRL algorithms that are general, in that they do not make onerous additional assumptions beyond standard RL algorithms, and efficient, in the sense that they can be used with modest numbers of interaction samples, making them suitable for real-world problems such as robotic control.

Learning motor primitives for robotics

  • J. Kober, Jan Peters
  • Computer Science
    2009 IEEE International Conference on Robotics and Automation
  • 2009
It is shown that two new motor skills, i.e., Ball-in-a-Cup and Ball-Paddling, can be learned on a real Barrett WAM robot arm at a pace similar to human learning while achieving a significantly more reliable final performance.

Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning

This work simplifies the long-horizon policy learning problem by using a novel data-relabeling algorithm for learning goal-conditioned hierarchical policies, where the low-level only acts for a fixed number of steps, regardless of the goal achieved.

Latent Skill Planning for Exploration and Transfer

This paper leverages the idea of partial amortization for fast adaptation at test time in single tasks as well as in transfer from one task to another, as compared to competitive baselines.

Neural Dynamic Policies for End-to-End Sensorimotor Learning

Neural Dynamic Policies (NDPs) are proposed, which make predictions in trajectory distribution space, as opposed to prior policy learning methods where actions represent the raw control space. NDPs outperform the prior state of the art in terms of either efficiency or performance across several robotic control tasks for both imitation and reinforcement learning setups.