Learning to Sequence and Blend Robot Skills via Differentiable Optimization

@article{Jaquier2022LearningTS,
  title={Learning to Sequence and Blend Robot Skills via Differentiable Optimization},
  author={No{\'e}mie Jaquier and You Zhou and Julia Starke and Tamim Asfour},
  journal={IEEE Robotics and Automation Letters},
  year={2022},
  volume={7},
  pages={8431--8438}
}
In contrast to humans and animals who naturally execute seamless motions, learning and smoothly executing sequences of actions remains a challenge in robotics. This letter introduces a novel skill-agnostic framework that learns to sequence and blend skills based on differentiable optimization. Our approach encodes sequences of previously-defined skills as quadratic programs (QP), whose parameters determine the relative importance of skills along the task. Seamless skill sequences are then… 
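
The idea of QP parameters that set the relative importance of skills can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes an unconstrained weighted least-squares objective, hand-set weights rather than learned ones, and hypothetical names (`blend_skills`, `transition_weights`, `reach`, `place`). With no constraints, the QP that minimizes sum_i w_i * ||u - u_i||^2 has a closed-form solution, the weighted average of the individual skill commands.

```python
def blend_skills(commands, weights):
    """Solve min_u sum_i w_i * ||u - u_i||^2 (an unconstrained QP).

    Setting the gradient to zero gives u = sum_i w_i * u_i / sum_i w_i.
    `commands` is a list of equal-length command vectors, one per skill.
    """
    if not commands or len(commands) != len(weights):
        raise ValueError("need one non-negative weight per skill command")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(commands[0])
    return [
        sum(w * u[k] for u, w in zip(commands, weights)) / total
        for k in range(dim)
    ]


def transition_weights(phase):
    """Hypothetical linear hand-over from skill 0 to skill 1.

    `phase` is clamped to [0, 1]; in a learned setting these weights
    would be the parameters being optimized, not a fixed schedule.
    """
    phase = min(max(phase, 0.0), 1.0)
    return [1.0 - phase, phase]


# Two illustrative skill commands (assumed values, not from the paper).
reach = [1.0, 0.0]
place = [0.0, 1.0]

for phase in (0.0, 0.5, 1.0):
    blended = blend_skills([reach, place], transition_weights(phase))
    print(phase, blended)
```

As the phase advances, the blended command moves continuously from the first skill's command to the second's, which is the kind of seamless transition the weights are meant to control. A constrained version would need a QP solver in place of the closed form.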

Hierarchical Policy Blending As Optimal Transport

This hierarchical framework adapts the weights of low-level reactive expert policies, adding a look-ahead planning layer on the parameter space of a product of expert policies and agents, paving the way for new applications of optimal transport to robot control.

References

Learning and Sequencing of Object-Centric Manipulation Skills for Industrial Tasks

A rapid robot skill-sequencing algorithm, where the skills are encoded by object-centric hidden semi-Markov models, which significantly reduces manual modeling efforts, while ensuring a high degree of flexibility and re-usability of learned skills.

Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation

Applying simultaneous shape and goal learning to sequences of motion primitives leads to the novel algorithm PI² Seq, which is used to address a fundamental challenge in manipulation: improving the robustness of everyday pick-and-place tasks.

Robot learning from demonstration by constructing skill trees

It is shown that CST (constructing skill trees) can be used to acquire skills from human demonstration in a dynamic continuous domain, and from both expert demonstration and learned control sequences on the uBot-5 mobile manipulator.

Using probabilistic movement primitives in robotics

A stochastic feedback controller is derived that reproduces the encoded variability of the movement and the coupling of the degrees of freedom of the robot by using a probabilistic representation.

Learning the Nonlinear Multivariate Dynamics of Motion of Robotic Manipulators

An algorithm to estimate multivariate robot motions through a Mixture of Gaussians, which allows a motion to be generalized to unseen contexts and provides fast on-line replanning of the motion in the face of spatio-temporal perturbations.

Learning movement primitive attractor goals and sequential skills from kinesthetic demonstrations

Learning soft task priorities for safe control of humanoid robots with constrained stochastic optimization

This paper retains (1+1)-CMA-ES with covariance constrained adaptation as the best candidate to solve the problems, and shows its effectiveness on two whole-body experiments with the iCub humanoid robot.

Synthesis of complex humanoid whole-body behavior: A focus on sequencing and tasks transitions

A novel approach is presented for handling transitions while a humanoid robot performs a sequence of dynamic tasks, using a weight-based strategy to represent the tasks' relative importance.

Probabilistic progress prediction and sequencing of concurrent movement primitives

This paper introduces a concept to learn and estimate the progress of individual MPs from a low number of demonstrations and proposes a representation of the task that incorporates several concurrent sequences of MPs, which allows coordinated bi-manual movement tasks to be learned and reproduced robustly.

Multiple task optimization with a mixture of controllers for motion generation

The main contribution of this paper is the development of a framework that automatically derives suitable mixture coefficients representing priorities, making it possible to flexibly impose priorities when pursuing different goals in parallel.