Blind Bipedal Stair Traversal via Sim-to-Real Reinforcement Learning

Jonah Siekmann, Kevin R. Green, John Warila, Alan Fern, Jonathan W. Hurst

Accurate and precise terrain estimation is a difficult problem for robot locomotion in real-world environments. Thus, it is useful to have systems that do not depend on accurate estimation to the point of fragility. In this paper, we explore the limits of such an approach by investigating the problem of traversing stair-like terrain without any external perception or terrain models on a bipedal robot. For such blind bipedal platforms, the problem appears difficult (even for humans) due to the… 

Learning Bipedal Walking On Planned Footsteps For Humanoid Robots

This paper shows that simply feeding the upcoming 2 steps to the policy is sufficient to achieve omnidirectional walking, turning in place, standing, and climbing stairs. It employs curriculum learning on the complexity of terrains and circumvents the need for reference motions or pre-trained weights.

Learning robust perceptive locomotion for quadrupedal robots in the wild

An attention-based recurrent encoder is leveraged that integrates proprioceptive and exteroceptive input and learns to seamlessly combine the different perception modalities without resorting to heuristics, creating a legged locomotion controller with high robustness and speed.

Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking

  • Helei Duan, A. Malik, J. Hurst
  • Biology, Computer Science
  • 2022 International Conference on Robotics and Automation (ICRA)
  • 2022
This paper develops an RL formulation for training dynamic gait controllers that can respond to specified touchdown locations and uses supervised learning to induce a transition model for accurately predicting the next touchdown locations that the controller can achieve given the robot's proprioceptive observations.

Visual-Locomotion: Learning to Walk on Complex Terrains with Vision

A framework to train a vision-based locomotion controller that enables a quadrupedal robot to traverse uneven environments; it is validated on a real robot walking over a series of gaps and climbing up a platform.

Learning Coordinated Terrain-Adaptive Locomotion by Imitating a Centroidal Dynamics Planner

This work shows that terrain-adaptive controllers can be obtained by training policies to imitate trajectories that have been planned over procedural terrains by a non-linear solver. The learned policies transfer to unseen terrains and can be fine-tuned to dynamically traverse challenging terrains that require precise foot placements and are very hard to solve with standard RL.

Adapting Rapid Motor Adaptation for Bipedal Robots

This paper proposes A-RMA (Adapting RMA), which additionally adapts the base policy for the imperfect extrinsics estimator by tuning it using model-free RL.

Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

It is demonstrated that a modest amount of real-world training can substantially improve performance during deployment, and this enables a real A1 quadrupedal robot to autonomously fine-tune multiple locomotion skills in a range of environments, including an outdoor lawn and a variety of indoor terrains.

Linear Policies are Sufficient to Realize Robust Bipedal Walking on Challenging Terrains

This work proposes a new control pipeline, wherein the high-level trajectory modulator shapes the end-foot ellipsoidal trajectories, and the low-level gait controller regulates the torso and ankle orientation, and uses a linear PD control law.

Towards Real Robot Learning in the Wild: A Case Study in Bipedal Locomotion

This paper demonstrates how a small bipedal robot can autonomously learn to walk with minimal human intervention and with minimal instrumentation of the environment, using data-efficient off-policy deep reinforcement learning to learn to walk end-to-end, directly on hardware.

Perceptive Locomotion through Nonlinear Model Predictive Control

Dynamic locomotion in rough terrain requires accurate foot placement, collision avoidance, and planning of the underactuated dynamics of the system. Reliably optimizing for such motions and…

Learning quadrupedal locomotion over challenging terrain

The presented work indicates that robust locomotion in natural environments can be achieved by training in simple domains.

Heuristic Planning for Rough Terrain Locomotion in Presence of External Disturbances and Variable Perception Quality

A heuristic-based planning approach is proposed that enables a quadruped robot to successfully traverse significantly rough terrain in the absence of visual feedback. It includes reflexes, triggered in specific situations, and the possibility of estimating an unknown time-varying disturbance online and compensating for it.

Sim-to-Real Learning of All Common Bipedal Gaits via Periodic Reward Composition

A reward-specification framework based on composing simple probabilistic periodic costs on basic forces and velocities is proposed, and this framework is instantiated to define a parametric reward function with intuitive settings for all common bipedal gaits: standing, walking, hopping, running, and skipping.

Feedback Control For Cassie With Deep Reinforcement Learning

The effectiveness of DRL is demonstrated using a realistic model of Cassie, a bipedal robot, and robustness is demonstrated through several challenging tests, including sensory delay, walking blindly on irregular terrain and unexpected pushes at the pelvis.

Learning Memory-Based Control for Human-Scale Bipedal Locomotion

This work considers recurrent neural networks for sim-to-real biped locomotion, allowing for policies that learn to use internal memory to model important physical properties, and shows that RNNs could use their learned memory states to perform online system identification by encoding parameters of the dynamics into memory.

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

By randomizing the dynamics of the simulator during training, this paper is able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.

A Finite-State Machine for Accommodating Unexpected Large Ground-Height Variations in Bipedal Robot Walking

A feedback controller that allows MABEL, which is a kneed planar bipedal robot with 1-m-long legs, to accommodate terrain that presents large unexpected increases and decreases in height is presented.

GPU-accelerated real-time 3D tracking for humanoid locomotion and stair climbing

A robust model-based three-dimensional tracking system uses programmable graphics hardware to operate online at frame rate during locomotion of a humanoid robot, and recovers the full 6-degree-of-freedom pose of viewable objects relative to the robot.

Virtual Model Control: An Intuitive Approach for Bipedal Locomotion

This paper successfully compels a simulated seven-link planar biped to walk blindly up and down slopes and over rolling terrain, and describes how the algorithm can be augmented for rough-terrain walking based on geometric considerations.

MIT Cheetah 3: Design and Control of a Robust, Dynamic Quadruped Robot

A new leg design is presented that includes proprioceptive actuation on the abduction/adduction degrees of freedom in addition to an expanded range of motion on the hips and knees, and represents a promising step toward a platform capable of generalized dynamic legged locomotion.