A reward allocation method for reinforcement learning in stabilizing control of T-inverted pendulum

Abstract

Reinforcement learning is a class of machine learning methods that does not require a detailed teaching signal from a human, and it is expected to be applied to real robots. In such applications, learning must finish within a short period of time. Non-bootstrap reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the aim is to reach a target state in minimum time. However, such methods have difficulty learning tasks that require keeping a stable state for as long as possible. This paper improves a reward allocation method for stabilizing control tasks. The validity of the method is demonstrated through simulation of stabilizing control of a T-inverted pendulum. The proposed method can acquire a policy that keeps a stable state within a short learning period.
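To make the idea of non-bootstrap reward allocation concrete, the following is a minimal sketch of episode-based credit assignment for a goal-reaching task. It is not the authors' method, whose allocation rule is not given in the abstract. The function name `allocate_reward`, the geometric `decay` factor, and the toy episode are all assumptions chosen for illustration.

```python
from collections import defaultdict


def allocate_reward(episode, reward, decay=0.5, weights=None):
    """Distribute a terminal reward backward over an episode.

    Hypothetical helper: the last (state, action) pair receives the full
    reward, and each earlier pair receives a geometrically smaller share.
    No value estimate of a successor state is used, so nothing is
    bootstrapped.
    """
    if weights is None:
        weights = defaultdict(float)
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay  # assumed geometric decay, not taken from the paper
    return weights


# Toy episode that ends by reaching a goal state and earning reward 1.0.
episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
weights = allocate_reward(episode, reward=1.0, decay=0.5)
```

A scheme of this kind pays out only when an episode reaches a rewarded goal state. A stabilizing task has no such goal state to reach, which is presumably the difficulty the abstract refers to.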

Cite this paper

@article{Hosokawa2012ARA, title={A reward allocation method for reinforcement learning in stabilizing control of T-inverted pendulum}, author={Shu Hosokawa and Kazushi Nakano}, journal={2012 9th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology}, year={2012}, pages={1-4} }