Reinforcement learning is a class of machine learning methods that does not require a detailed teaching signal from a human, and it is expected to be applied to real robots. For such applications, learning must be completed within a short period of time. Non-bootstrap reinforcement learning methods converge quickly on tasks such as Sutton's maze problem, where the goal is to reach a target state in minimum time. However, such methods have difficulty learning tasks in which a stable state must be kept for as long as possible. This paper improves a reward allocation method for stabilizing control tasks. The validity of the proposed method is demonstrated through simulation of stabilizing control of a T-inverted pendulum. The proposed method can acquire a policy that keeps a stable state within a short learning period.