Learning Stable Normalizing-Flow Control for Robotic Manipulation

@article{Khader2021LearningSN,
  title={Learning Stable Normalizing-Flow Control for Robotic Manipulation},
  author={Shahbaz Abdul Khader and Hang Yin and Pietro Falco and Danica Kragic},
  journal={2021 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2021},
  pages={1644-1650}
}
Reinforcement Learning (RL) of robotic manipulation skills, despite its impressive successes, stands to benefit from incorporating domain knowledge from control theory. One of the most important such properties is control stability. Ideally, one would like to achieve stability guarantees while staying within the framework of state-of-the-art deep RL algorithms. No such solution exists in general, especially one that scales to complex manipulation tasks. We contribute…
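
The abstract is truncated, but the title points to the core recipe: a policy built from an invertible normalizing flow composed with a simple, provably stable feedback law. The sketch below illustrates that general flavor only; the coupling-layer form, the latent linear feedback, and all dimensions are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def coupling_forward(x, w, b):
    """One RealNVP-style coupling layer: invertible by construction."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    s = np.tanh(w @ x1 + b)              # bounded log-scale
    y2 = x2 * np.exp(s) + (w @ x1)       # affine transform of the second half
    return np.concatenate([x1, y2])

def policy(x, flow_params, K):
    """Map the state through the flow, then apply a stable linear law in latent space."""
    y = x
    for w, b in flow_params:
        y = coupling_forward(y, w, b)
    return -K @ y                        # stabilizing feedback in latent coordinates

rng = np.random.default_rng(0)
d = 4                                    # toy state dimension
flow_params = [(0.1 * rng.standard_normal((d // 2, d // 2)), np.zeros(d // 2))
               for _ in range(3)]
print(policy(rng.standard_normal(d), flow_params, np.eye(d)))
```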

Citations

Learning Deep Neural Policies with Stability Guarantees

TLDR
This work achieves unconditional stability in deep reinforcement learning by deriving an interpretable deep policy structure from the energy-shaping control of Lagrangian systems, and it establishes stability during physical interaction with an unknown environment through passivity.
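
A minimal sketch of the energy-shaping-plus-damping structure this summary describes; the learned potential below is a toy stand-in, and a real implementation would use a neural potential with autodiff rather than finite differences:

```python
import numpy as np

def potential(q, W, c):
    """Toy learned potential V(q) >= 0, radially increasing in q."""
    z = np.tanh(W @ q + c)
    return 0.5 * q @ q + z @ z

def grad_potential(q, W, c, eps=1e-6):
    """Finite-difference gradient of V (autodiff in a real implementation)."""
    g = np.zeros_like(q)
    for i in range(len(q)):
        dq = np.zeros_like(q)
        dq[i] = eps
        g[i] = (potential(q + dq, W, c) - potential(q - dq, W, c)) / (2 * eps)
    return g

def energy_shaping_policy(q, qdot, W, c, D):
    """u = -grad V(q) - D qdot: shape the energy, inject damping (passive by design)."""
    return -grad_potential(q, W, c) - D @ qdot

rng = np.random.default_rng(0)
W, c, D = rng.standard_normal((8, 2)), np.zeros(8), 2.0 * np.eye(2)
print(energy_shaping_policy(np.array([0.3, -0.1]), np.zeros(2), W, c, D))
```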

Learning Deep Energy Shaping Policies for Stability-Guaranteed Manipulation

TLDR
This work achieves stability-guaranteed DRL in a model-free framework that is general enough for contact-rich manipulation tasks and demonstrates, to the best of the authors' knowledge, the first DRL with a stability guarantee on a real robotic manipulator.

Embedding Koopman Optimal Control in Robot Policy Learning

TLDR
This work embeds a linear-quadratic-regulator formulation within a Koopman representation, combining the tractability of a closed-form solution with the richness of a non-convex neural network, and demonstrates a real-world application in a robot pivoting task.
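
A rough sketch of the Koopman-LQR idea, assuming random tanh features as a stand-in for the learned encoder and hand-picked lifted dynamics in place of matrices fit to data:

```python
import numpy as np

def lift(x, W):
    """Koopman features; random tanh features stand in for the learned encoder."""
    return np.concatenate([x, np.tanh(W @ x)])

def dlqr(A, B, Q, R, iters=200):
    """Discrete-time LQR gain by Riccati value iteration (closed-form inner loop)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 2))                           # toy encoder weights
A = 0.9 * np.eye(5) + 0.01 * rng.standard_normal((5, 5))  # toy lifted dynamics
B = rng.standard_normal((5, 1))                           # (fit to data in practice)
K = dlqr(A, B, np.eye(5), np.eye(1))
u = -K @ lift(np.array([0.2, -0.1]), W)                   # LQR in the lifted space
print(u)
```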

Multiscale Sensor Fusion and Continuous Control with Neural CDEs

TLDR
This work presents InFuser, a unified architecture that trains continuous-time policies with Neural Controlled Differential Equations (CDEs) and evolves a single latent state representation over time, enabling policies that react to multi-frequency, multi-sensory feedback for truly end-to-end visuomotor control without discrete-time assumptions.
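
A toy Euler discretization of the neural-CDE mechanism the summary refers to; the tanh vector field, linear readout, and all weights below are illustrative assumptions, not InFuser's architecture:

```python
import numpy as np

def cde_policy(path, W0, Wf, Wa):
    """Euler-discretized neural CDE: z0 = xi(X0), dz = f(z) dX, action = Wa z.

    path: (T, p) observation path X; in practice time is appended as a
    channel, which is how irregular sampling rates enter the model.
    """
    dim_z = Wa.shape[1]
    z = np.tanh(W0 @ path[0])                   # initial latent from first observation
    for k in range(1, len(path)):
        dX = path[k] - path[k - 1]              # increment of the control path
        F = np.tanh(Wf @ z).reshape(dim_z, -1)  # matrix-valued vector field f(z)
        z = z + F @ dX                          # Euler step of dz = f(z) dX
    return Wa @ z                               # linear action readout

rng = np.random.default_rng(0)
p, dim_z, m = 3, 4, 2
W0 = rng.standard_normal((dim_z, p))
Wf = 0.1 * rng.standard_normal((dim_z * p, dim_z))
Wa = rng.standard_normal((m, dim_z))
path = np.cumsum(rng.standard_normal((20, p)), axis=0)  # toy multi-rate sensor path
print(cde_policy(path, W0, Wf, Wa))
```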

References

Showing 1-10 of 28 references

Stability-Guaranteed Reinforcement Learning for Contact-Rich Manipulation

TLDR
This work introduces the term all-the-time stability, meaning unambiguously that every possible rollout in stable RL must be stability-certified, and it proposes a novel policy search algorithm, inspired by the Cross-Entropy Method, that inherently guarantees stability.
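
A sketch of how a Cross-Entropy-Method search can require certification before any rollout, in the spirit of all-the-time stability; the `score` and `certify` callables and the norm-ball certificate in the usage example are placeholders:

```python
import numpy as np

def cem_stable(score, certify, dim, iters=20, pop=64, elite=8, seed=0):
    """Cross-Entropy Method that only ever evaluates certified-stable samples.

    score:   parameters -> return estimate (from a rollout)
    certify: parameters -> True iff stability is certified *before* any rollout
    """
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        cand = mu + sigma * rng.standard_normal((pop, dim))
        cand = np.array([c for c in cand if certify(c)])  # all-the-time stability
        if len(cand) < elite:
            sigma *= 1.5                                  # widen the search, retry
            continue
        scores = np.array([score(c) for c in cand])
        best = cand[np.argsort(scores)[-elite:]]          # keep the top `elite`
        mu, sigma = best.mean(axis=0), best.std(axis=0) + 1e-3
    return mu

target = np.array([0.5, -0.2])
theta = cem_stable(lambda th: -np.sum((th - target) ** 2),
                   lambda th: bool(np.linalg.norm(th) < 2.0), dim=2)
print(theta)   # converges toward `target` using certified samples only
```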

Learning contact-rich manipulation skills with guided policy search

TLDR
This paper extends a recently developed policy search method and uses it to learn a range of dynamic manipulation behaviors with highly general policy representations, without using known models or example demonstrations, and shows that this method can acquire fast, fluent behaviors after only minutes of interaction time.

Learning Variable Impedance Control for Contact Sensitive Tasks

TLDR
This letter investigates how the choice of action space yields robust performance in the presence of contact uncertainties; it proposes learning a policy that outputs joint-space impedance and desired position, and compares this approach to torque and position control under different contact uncertainties.
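
A minimal sketch of the joint-space variable impedance action space described above, assuming unit joint inertia so that critical damping reduces to D = 2*sqrt(K):

```python
import numpy as np

def impedance_torque(q, qdot, q_des, K_diag):
    """Joint-space impedance law: tau = K (q_des - q) - D qdot.

    The policy outputs (q_des, K_diag); damping is set for critical
    damping per joint, assuming unit inertia.
    """
    K = np.diag(K_diag)
    D = np.diag(2.0 * np.sqrt(K_diag))
    return K @ (q_des - q) - D @ qdot

tau = impedance_torque(q=np.zeros(2), qdot=np.zeros(2),
                       q_des=np.array([0.2, -0.1]),
                       K_diag=np.array([50.0, 30.0]))
print(tau)
```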

End-to-End Training of Deep Visuomotor Policies

TLDR
This paper develops a method for learning policies that map raw image observations directly to torques at the robot's motors; the policies are trained using partially observed guided policy search, with supervision provided by a simple trajectory-centric reinforcement learning method.
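
The supervised step of guided policy search amounts to regressing the policy onto teacher actions; the sketch below uses ridge regression as a stand-in for the deep visuomotor network trained in the paper:

```python
import numpy as np

def fit_policy(obs, teacher_actions, reg=1e-3):
    """Regress a policy onto actions from a trajectory-centric teacher
    (ridge regression stands in for training a deep visuomotor network)."""
    O = np.hstack([obs, np.ones((len(obs), 1))])   # append a bias feature
    W = np.linalg.solve(O.T @ O + reg * np.eye(O.shape[1]),
                        O.T @ teacher_actions)
    return lambda o: np.append(o, 1.0) @ W

rng = np.random.default_rng(0)
obs = rng.standard_normal((100, 5))                # stand-in for image features
teacher_actions = obs @ rng.standard_normal((5, 2))
pi = fit_policy(obs, teacher_actions)
print(pi(obs[0]))
```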

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

TLDR
This work proposes a controller architecture that combines a model-free RL-based controller with model-based controllers utilizing control barrier functions (CBFs) and online learning of the unknown system dynamics, in order to ensure safety during learning.
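
For a single affine CBF constraint and no input limits, the usual safety-filter QP has the closed-form projection sketched below; the Lie-derivative terms a and b are assumed to come from a (learned) dynamics model:

```python
import numpy as np

def cbf_filter(u_rl, a, b):
    """Minimally modify u_rl so that the affine CBF constraint a.u + b >= 0 holds.

    a = L_g h(x) and b = L_f h(x) + alpha * h(x); this closed form replaces
    the QP when there is one constraint and no input limits.
    """
    viol = a @ u_rl + b
    if viol >= 0.0:
        return u_rl                       # the nominal RL action is already safe
    return u_rl - (viol / (a @ a)) * a    # smallest correction onto the boundary

u = cbf_filter(np.array([1.0, 0.0]), a=np.array([0.0, 1.0]), b=-0.2)
print(u)   # projected action satisfies a.u + b = 0
```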

Control Regularization for Reduced Variance Reinforcement Learning

TLDR
This work proposes a functional regularization approach to augmenting model-free RL, regularizing the behavior of the deep policy to be similar to a policy prior; this yields a bias-variance trade-off and is validated empirically on a range of settings.
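
One simple form of such regularization is to mix the learned action with the prior's action; this is a hedged sketch, and the paper's precise weighting scheme may differ:

```python
import numpy as np

def regularized_action(u_rl, u_prior, lam):
    """Interpolate the learned action toward a control prior.

    lam in [0, 1] is the regularization strength: larger lam means more
    bias toward the (stabilizing) prior, less variance from the RL policy.
    """
    return lam * u_prior + (1.0 - lam) * u_rl

print(regularized_action(np.array([1.0, 0.0]), np.array([0.0, 1.0]), lam=0.3))
```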

Safe Model-based Reinforcement Learning with Stability Guarantees

TLDR
This paper presents a learning algorithm that explicitly considers safety, defined in terms of stability guarantees; it extends control-theoretic results on Lyapunov stability verification and shows how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates.
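
A toy illustration of discrete-time Lyapunov verification: checking V(f(x)) - V(x) <= 0 on a grid of states, where sampling stands in for the formal region-of-attraction analysis used in the paper:

```python
import numpy as np

def verify_decrease(V, f, grid, margin=0.0):
    """Check the discrete-time Lyapunov condition V(f(x)) - V(x) <= -margin
    on a finite grid of states (a sampling stand-in for formal verification)."""
    return all(V(f(x)) - V(x) <= -margin for x in grid)

A = np.array([[0.9, 0.1], [0.0, 0.8]])     # stable toy linear dynamics
V = lambda x: x @ x                         # quadratic Lyapunov candidate
f = lambda x: A @ x
grid = [np.array([a, b]) for a in np.linspace(-1, 1, 11)
                         for b in np.linspace(-1, 1, 11)]
print(verify_decrease(V, f, grid))          # True: V decreases on the grid
```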

Data-Efficient Model Learning and Prediction for Contact-Rich Manipulation Tasks

TLDR
This letter proposes a method that explicitly adopts a hybrid structure for the model while leveraging the uncertainty representation and data efficiency of Gaussian processes, closing the gap in forward dynamics models and multi-step prediction of state variables for contact-rich manipulation.
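
A self-contained sketch of the Gaussian-process component (closed-form RBF regression); the hybrid structure would route queries among several such models, one per contact mode, which is not shown here:

```python
import numpy as np

def gp_predict(X, y, Xs, ls=1.0, noise=1e-2):
    """Closed-form GP regression with an RBF kernel: posterior mean and variance."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / ls ** 2)
    Kxx = k(X, X) + noise * np.eye(len(X))      # kernel matrix plus noise
    Kxs = k(X, Xs)
    mean = Kxs.T @ np.linalg.solve(Kxx, y)
    var = 1.0 - np.sum(Kxs * np.linalg.solve(Kxx, Kxs), axis=0)
    return mean, var

X = np.linspace(0.0, 1.0, 10)[:, None]          # toy 1-D state-action inputs
y = np.sin(2.0 * np.pi * X[:, 0])               # toy next-state targets
mean, var = gp_predict(X, y, np.array([[0.25], [0.75]]), ls=0.2)
print(mean, var)                                # prediction with uncertainty
```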

Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks

TLDR
It is shown that VICES improves sample efficiency, maintains low energy consumption, and ensures safety across all three experimental setups, and that RL policies learned with VICES can transfer across different robot models in simulation and from simulation to the real world for the same robot.
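
A minimal sketch of the end-effector variable impedance action space (VICES-style), assuming unit task-space inertia for the critical-damping heuristic and a Jacobian supplied by the robot model:

```python
import numpy as np

def vices_torque(x, xdot, J, x_des, K_diag):
    """End-effector impedance: F = K (x_des - x) - D xdot, then tau = J^T F.

    The RL action is (x_des, K_diag) in end-effector space; J is the
    manipulator Jacobian at the current configuration.
    """
    K = np.diag(K_diag)
    D = np.diag(2.0 * np.sqrt(K_diag))   # critical damping, assuming unit mass
    F = K @ (x_des - x) - D @ xdot       # desired end-effector wrench
    return J.T @ F                       # map the wrench to joint torques

J = np.array([[1.0, 0.5], [0.0, 1.0]])  # toy 2-DoF Jacobian
tau = vices_torque(x=np.zeros(2), xdot=np.zeros(2), J=J,
                   x_des=np.array([0.05, 0.0]), K_diag=np.array([200.0, 200.0]))
print(tau)
```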

Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

TLDR
This work uses self-supervision to learn a compact multimodal representation of sensory inputs, which can then be used to improve the sample efficiency of policy learning in deep reinforcement learning.
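
A toy sketch of the fuse-then-predict pattern: encode each modality, concatenate, and train on a label-free prediction target. The paper's actual targets (e.g., optical flow and contact prediction) are richer than the single force-prediction loss assumed here:

```python
import numpy as np

def fuse(img_feat, force_feat, Wi, Wf):
    """Fuse vision and touch encodings into one compact multimodal latent."""
    return np.concatenate([np.tanh(Wi @ img_feat), np.tanh(Wf @ force_feat)])

def self_supervised_loss(z, Wp, next_force):
    """One possible label-free objective: predict the next force reading
    from the fused latent."""
    return float(np.mean((Wp @ z - next_force) ** 2))

rng = np.random.default_rng(0)
Wi, Wf = rng.standard_normal((8, 16)), rng.standard_normal((8, 6))
Wp = rng.standard_normal((6, 16))
z = fuse(rng.standard_normal(16), rng.standard_normal(6), Wi, Wf)
print(self_supervised_loss(z, Wp, rng.standard_normal(6)))
```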