Corpus ID: 13944350

Stabilization of Biped Robot based on Two mode Q-learning

Kui-Hong Park, Jun Jo, Jong-Hwan Kim
Abstract In this paper, two mode Q-learning, an extension of Q-learning, is used to stabilize the Zero Moment Point (ZMP) of a biped robot in the standing posture. In two mode Q-learning, the experiences of both success and failure of an agent are used for fast convergence. To demonstrate the effectiveness of two mode Q-learning against conventional Q-learning, the property of convergence is investigated through simulation in a grid world. This paper also presents the experimental results of the…
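The grid-world convergence study in the abstract builds on the standard tabular Q-learning update. A minimal sketch follows; the 1-D grid, reward values, and parameters are illustrative assumptions, and the two-mode extension (separately exploiting failure experiences) is not reproduced here:

```python
import random

random.seed(0)

# Minimal 1-D grid world: states 0..4, success at state 4, failure at state 0.
N_STATES = 5
ACTIONS = (-1, +1)          # move left / move right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Return (next_state, reward, done)."""
    s2 = max(0, min(N_STATES - 1, s + a))
    if s2 == N_STATES - 1:
        return s2, 1.0, True    # success
    if s2 == 0:
        return s2, -1.0, True   # failure
    return s2, 0.0, False

def choose(s):
    """Epsilon-greedy action selection."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

for episode in range(500):
    s, done = 2, False          # start in the middle of the grid
    while not done:
        a = choose(s)
        s2, r, done = step(s, a)
        # One-step Q-learning update; two mode Q-learning additionally
        # exploits failure episodes for faster convergence.
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# The learned greedy policy from the middle state should move toward the goal.
print(max(ACTIONS, key=lambda a: Q[(2, a)]))
```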


Q Learning based Reinforcement Learning Approach to Bipedal Walking Control

Stabilization of an equivalent double inverted pendulum representing a bipedal robot has been successfully implemented, using Q-learning, to balance the pole angles within the required range.

Adaptive PD Controller Modeled via Support Vector Regression for a Biped Robot

The intelligent computing control technique based on support vector regression (SVR) is used, and the results show that the implemented gait combined with the SVR controller can be used to control this biped robot.

Sagittal stability PD controllers for a biped robot using a neurofuzzy network and an SVR

The results show that the implemented gait combined either with the SVR controller or with the TSK NF network controller can be used to control this biped robot.

A Reinforcement Learning Approach to Autonomous Speed Control in Robotic Systems

The Q-learning algorithm, a widely used model-free algorithm, is designed and implemented to find the optimal speed control function for a fast-moving train on a fixed track, and the performance of the learning models is compared in a physical environment.

SVR Versus Neural-Fuzzy Network Controllers for the Sagittal Balance of a Biped Robot

Two alternative intelligent computing control techniques were compared: one based on support vector regression (SVR) and another based on a first-order Takagi-Sugeno-Kang (TSK)-type neural-fuzzy (NF) network.

A Human-Robot Collaborative Reinforcement Learning Algorithm

This paper presents a new reinforcement learning algorithm that enables collaborative learning between a robot and a human. The algorithm, which is based on the Q(λ) approach, expedites the learning process.

Collaborative Q(λ) Reinforcement Learning Algorithm: A Promising Robot Learning Framework (International Conference on Robotics and Applications, RA 2005, Cambridge, U.S.A.)

The design and implementation of a new reinforcement learning (RL) based algorithm that allows several learning agents to acquire knowledge from each other, and which proved to accelerate learning in a robotic navigation problem.

Human-Robot Collaborative Learning of a Bag Shaking Trajectory

Experimental results are demonstrated that support the hypothesis that learning is faster when human collaboration is triggered than when the system functions autonomously.

Human-Robot Collaborative Learning System for Inspection

  • K. Uri, S. Helman, E. Yael
  • Computer Science
    2006 IEEE International Conference on Systems, Man and Cybernetics
  • 2006
A collaborative reinforcement learning algorithm, CQ(λ), designed to accelerate learning by integrating a human operator into the learning process; it was tested on a Motoman UP-6 fixed-arm robot required to empty the contents of a suspicious bag.

Parameterized gait pattern generator based on linear inverted pendulum model with natural ZMP references

A parameterized gait generator based on linear inverted pendulum model (LIPM) theory, which allows users to generate a natural gait pattern with desired step sizes and provides a way for users to generate gait patterns with self-defined ZMP references by using different components.
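The LIPM underlying such generators relates the center of mass (CoM) to the ZMP through ẍ = (g/z_c)(x − p), where p is the ZMP position and z_c the constant CoM height. A minimal forward simulation of this model can be sketched as follows; the constant CoM height, fixed ZMP reference, and parameter values are simplifying assumptions:

```python
import math

G = 9.81      # gravity [m/s^2]
Z_C = 0.8     # constant CoM height [m] (the core LIPM assumption)
DT = 0.005    # integration step [s]

def simulate_lipm(x0, v0, p_zmp, t_end):
    """Integrate x'' = (g/z_c) * (x - p_zmp) with semi-implicit Euler."""
    x, v, t = x0, v0, 0.0
    while t < t_end:
        a = (G / Z_C) * (x - p_zmp)   # CoM acceleration toward/away from ZMP
        v += a * DT
        x += v * DT
        t += DT
    return x, v

# Closed-form check: with p_zmp = 0, x(t) = x0*cosh(w t) + (v0/w)*sinh(w t),
# where w = sqrt(g / z_c) is the LIPM natural frequency.
w = math.sqrt(G / Z_C)
x_num, _ = simulate_lipm(0.05, 0.0, 0.0, 0.3)
x_ref = 0.05 * math.cosh(w * 0.3)
print(x_num, x_ref)
```

The divergent cosh/sinh solution is why gait generators place ZMP references so that successive steps keep the CoM trajectory bounded.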

Development of a biped walking robot compensating for three-axis moment by trunk motion

A control method of dynamic biped walking for a biped walking robot to compensate for the three-axis (pitch, roll and yaw-axis) moment on an arbitrary planned ZMP by trunk motion is introduced.

Humanoid Robot HanSaRam: Recent Progress and Developments

This talk deals with two issues for humanoid robots: navigation within complex environments, and a novel algorithm that can modify the walking period and step length in both the sagittal and lateral planes.

Contribution to the synthesis of biped gait.

Based on two-level control, the connection between the dynamics of the system and the algorithmic level is modified in this paper by introducing feedback, that is, a system of regulators, only at the level of the formed type of gait.

A biped walking robot having a ZMP measurement system using universal force-moment sensors

A method of measuring the ZMP throughout the whole walking phase is proposed, and a newly developed biped walking robot that has a ZMP measurement system using two universal force-moment sensors is explained.
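Given a measured ground-reaction force F = (Fx, Fy, Fz) and moment M = (Mx, My, Mz) from such a force-moment sensor, the ZMP in the sensor frame is commonly computed as p_x = (−M_y − F_x·d)/F_z and p_y = (M_x − F_y·d)/F_z, where d is the vertical offset from the sensor origin down to the sole. A sketch follows; the z-up right-handed sign convention and the single-sensor setup are assumptions, not the paper's exact configuration:

```python
def zmp_from_wrench(fx, fy, fz, mx, my, d=0.0):
    """ZMP in the sensor frame from a measured force-moment pair.

    d is the vertical offset from the sensor origin to the sole surface.
    Assumes a z-up, right-handed frame with the moment expressed at the
    sensor origin.
    """
    if fz <= 0.0:
        raise ValueError("no ground contact: vertical force must be positive")
    px = (-my - fx * d) / fz
    py = (mx - fy * d) / fz
    return px, py

# A purely vertical load acting through the sensor origin places the ZMP
# at the origin; a pitch moment shifts it along y.
print(zmp_from_wrench(0.0, 0.0, 300.0, 0.0, 0.0))
print(zmp_from_wrench(0.0, 0.0, 300.0, 30.0, 0.0))
```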

Making Reinforcement Learning Work on Real Robots

HEDGER is a safe value-function approximation algorithm designed to be used with continuous state and action spaces, and with sparse reward functions, and JAQL is the general framework for reinforcement learning on real robots, and deals with the problems of initial knowledge and robot safety.

Incremental multi-step Q-learning

A novel incremental algorithm that combines Q-learning with the TD(λ) return estimation process, which is typically used in actor-critic learning, leading to faster learning and also helping to alleviate the non-Markovian effect of coarse state-space quantization.
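Combining Q-learning with TD(λ)-style return estimation is often done with eligibility traces. A sketch of the Watkins-style Q(λ) variant is below; the toy chain environment, accumulating traces, and all parameter values are illustrative assumptions rather than the paper's algorithm:

```python
import random

random.seed(1)

# Toy deterministic chain: states 0..3, reward on reaching state 3.
# Actions: 0 = move left, 1 = move right.
N, ACTIONS = 4, (0, 1)
ALPHA, GAMMA, LAM, EPS = 0.2, 0.9, 0.8, 0.3

Q = [[0.0, 0.0] for _ in range(N)]

def step(s, a):
    s2 = min(N - 1, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

def eps_greedy(s):
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda b: Q[s][b])

for _ in range(300):
    e = [[0.0, 0.0] for _ in range(N)]     # eligibility traces
    s, a, done = 0, eps_greedy(0), False
    while not done:
        s2, r, done = step(s, a)
        greedy = max(ACTIONS, key=lambda b: Q[s2][b])
        a2 = eps_greedy(s2)
        delta = r + (0.0 if done else GAMMA * Q[s2][greedy]) - Q[s][a]
        e[s][a] += 1.0                     # accumulating trace
        for si in range(N):
            for ai in ACTIONS:
                Q[si][ai] += ALPHA * delta * e[si][ai]
                # Watkins' cut: traces survive only while the next action
                # is greedy; an exploratory action zeroes them.
                e[si][ai] *= GAMMA * LAM if a2 == greedy else 0.0
        s, a = s2, a2

# Moving right from the start state should dominate after learning.
print(Q[0][1] > Q[0][0])
```

The traces let a single rewarding transition update every recently visited state-action pair at once, which is the source of the faster credit propagation the summary describes.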

Design and development of research platform for perception-action integration in humanoid robot: H6

The H6 is expected to be a common test-bed for experiment and discussion for various aspects of intelligent humanoid robotics.

The development of Honda humanoid robot

Due to its unique posture stability control, the Honda humanoid robot is able to maintain its balance despite unexpected complications such as uneven ground surfaces and to perform simple operations via wireless teleoperation.

Multiagent reinforcement learning using function approximation

Two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning are presented; multiple agents do not require explicit communication among themselves to learn coordinated behavior.