Hybrid Zero Dynamics Inspired Feedback Control Policy Design for 3D Bipedal Locomotion using Reinforcement Learning

  • Guillermo A. Castillo, Bowen Weng, W. Zhang, Ayonga Hereid
    2020 IEEE International Conference on Robotics and Automation (ICRA)
  • 2020
This paper presents a novel model-free reinforcement learning (RL) framework to design feedback control policies for 3D bipedal walking. Existing RL algorithms are often trained in an end-to-end manner or rely on prior knowledge of some reference joint trajectories. Different from these studies, we propose a novel policy structure that appropriately incorporates physical insights gained from the hybrid nature of the walking dynamics and the well-established hybrid zero dynamics approach for 3D… 

Reinforcement Learning-Based Cascade Motion Policy Design for Robust 3D Bipedal Locomotion

A novel reinforcement learning (RL) framework to design cascade feedback control policies for 3D bipedal locomotion that learns stable and robust walking gaits from scratch and allows the controller to realize omnidirectional walking with accurate tracking of the desired velocity and heading angle.

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

In this paper, a hierarchical and robust framework for learning bipedal locomotion is presented and successfully implemented on the 3D biped robot Digit built by Agility Robotics. We propose a

Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

  • Zhongyu Li, Xuxin Cheng, K. Sreenath
  • Engineering, Biology
    2021 IEEE International Conference on Robotics and Automation (ICRA)
  • 2021
A model-free reinforcement learning framework for training robust locomotion policies in simulation that can be transferred to a real bipedal Cassie robot; domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics.

Hybrid and dynamic policy gradient optimization for bipedal robot locomotion

This work proposes a novel policy gradient reinforcement learning method for biped locomotion, allowing the control policy to be simultaneously optimized by multiple criteria using a dynamic mechanism.

Data-Efficient and Safe Learning for Humanoid Locomotion Aided by a Dynamic Balancing Model

A novel Markov Decision Process (MDP) is proposed for safe and data-efficient learning of humanoid locomotion, aided by a dynamic balancing model, along with the scalability of the procedure to various types of humanoid robots and walking.

Reward-Adaptive Reinforcement Learning: Dynamic Policy Gradient Optimization for Bipedal Locomotion.

This work proposes a novel reward-adaptive reinforcement learning method for biped locomotion, allowing the control policy to be simultaneously optimized by multiple criteria using a dynamic mechanism, leading to hybrid policy gradients.

Geometric Control and Learning for Dynamic Legged Robots

It is shown that Euler-parametrization based orientation control in 3D requires greater input to stabilize on average, not just for large-error situations, and a model-based gait library design and deep learning are combined to yield a near constant-time and constant-memory policy for fast, stable and robust bipedal robot locomotion.

Adapting Rapid Motor Adaptation for Bipedal Robots

This paper proposes A-RMA (Adapting RMA), which additionally adapts the base policy for the imperfect extrinsics estimator by finetuning it using model-free RL.

Velocity Regulation of 3D Bipedal Walking Robots with Uncertain Dynamics Through Adaptive Neural Network Controller

This paper addresses the uncertainties in the robot dynamics from the perspective of the reduced dimensional representation of virtual constraints and proposes the integration of an adaptive neural network-based controller to regulate the robot velocity in the presence of model parameter uncertainties.

Learning Linear Policies for Robust Bipedal Locomotion on Terrains with Varying Slopes

A robust and fast feedback control law for bipedal walking on terrains with varying slopes, learned via a model-free, gradient-free algorithm, Augmented Random Search (ARS), on the two robot platforms Rabbit and Digit.



Feedback Control For Cassie With Deep Reinforcement Learning

The effectiveness of DRL is demonstrated using a realistic model of Cassie, a bipedal robot, and robustness is demonstrated through several challenging tests, including sensory delay, walking blindly on irregular terrain and unexpected pushes at the pelvis.

Reinforcement Learning Meets Hybrid Zero Dynamics: A Case Study for RABBIT

A novel approach for the design of feedback controllers using Reinforcement Learning (RL) and Hybrid Zero Dynamics (HZD) that embeds the HZD framework into the policy learning and results in a stable and robust control policy that is able to track variable speed within a continuous interval.

A Compliant Hybrid Zero Dynamics Controller for Stable, Efficient and Fast Bipedal Walking on MABEL

Five experiments are presented that highlight different aspects of MABEL and the feedback design method, ranging from basic elements such as stable walking and robustness under perturbations, to energy efficiency and a walking speed of 1.5 m s−1 (3.4 mph).

Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie

This paper proposes a practical method that allows the reward function to be fully redefined on each successive design iteration while limiting the deviation from the previous iteration, and demonstrates the effectiveness of this iterative-design approach on the bipedal robot Cassie.

DeepLoco: dynamic locomotion skills using hierarchical deep reinforcement learning

This paper aims to learn a variety of environment-aware locomotion skills with a limited amount of prior knowledge by adopting a two-level hierarchical control framework and training both levels using deep reinforcement learning.

Optimization and stabilization of trajectories for constrained dynamical systems

A trajectory optimization algorithm (DIRCON) is introduced that extends the direct collocation method, naturally incorporating manifold constraints to produce a nominal trajectory with third-order integration accuracy, a critical feature for achieving reliable tracking control.

Dynamic Humanoid Locomotion: A Scalable Formulation for HZD Gait Optimization

A methodology that allows for fast and reliable generation of dynamic robotic walking gaits through the HZD framework, even in the presence of underactuation, and develops a defect-variable substitution formulation to simplify expressions, which ultimately allows for compact analytic Jacobians of the constraints.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Human-inspired control of bipedal robots via control Lyapunov functions and quadratic programs

  • A. Ames
  • Engineering, Computer Science
    HSCC '13
  • 2013
The end result is the generation of bipedal robotic walking that is remarkably human-like and is experimentally realizable, as evidenced by the implementation of the resulting controllers on multiple robotic platforms.

Spring-Mass Walking With ATRIAS in 3D: Robust Gait Control Spanning Zero to 4.3 KPH on a Heavily Underactuated Bipedal Robot

We present a reduced-order approach for dynamic and efficient bipedal control, culminating in 3D balancing and walking with ATRIAS, a heavily underactuated bipedal robot. These results are a