# Learning to Drive a Bicycle Using Reinforcement Learning and Shaping

@inproceedings{Randlv1998LearningTD, title={Learning to Drive a Bicycle Using Reinforcement Learning and Shaping}, author={Jette Randl{\o}v and Preben Alstr{\o}m}, booktitle={ICML}, year={1998} }

We present and solve a real-world problem of learning to drive a bicycle. [...] We solve the problem by online reinforcement learning using the Sarsa(λ) algorithm. Then we solve the composite problem of learning to balance a bicycle and then drive to a goal. In our approach the reinforcement function is independent of the task the agent tries to learn to solve.
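The abstract names Sarsa(λ) as the learning algorithm. As a rough illustration of what one update of that algorithm looks like in its tabular form, here is a minimal sketch. It is not the authors' bicycle implementation: the function name `sarsa_lambda_step`, the dictionary-based tables, the step-size, discount, and trace-decay values, and the choice of accumulating eligibility traces are all assumptions made for the example.

```python
from collections import defaultdict


def sarsa_lambda_step(q, traces, s, a, r, s_next, a_next, done,
                      alpha=0.5, gamma=0.9, lam=0.8):
    """Apply one tabular Sarsa(lambda) update in place; return the TD error.

    q and traces map (state, action) pairs to floats. All names and
    default parameter values here are illustrative assumptions.
    """
    # On-policy target: uses the action actually chosen in the next state.
    target = r if done else r + gamma * q[(s_next, a_next)]
    delta = target - q[(s, a)]

    # Accumulating trace for the visited pair (an assumption; other
    # trace variants exist).
    traces[(s, a)] += 1.0

    # Every recently visited pair shares in the TD error, then decays.
    for key in list(traces):
        q[key] += alpha * delta * traces[key]
        traces[key] *= gamma * lam
    return delta


# Hypothetical two-step episode: state 0 -> state 1 -> terminal, reward 1 at the end.
q = defaultdict(float)
traces = defaultdict(float)
sarsa_lambda_step(q, traces, s=0, a=1, r=0.0, s_next=1, a_next=1, done=False)
sarsa_lambda_step(q, traces, s=1, a=1, r=1.0, s_next=None, a_next=None, done=True)
print(dict(q))
```

After the two updates, the final pair `(1, 1)` has value 0.5 and the earlier pair `(0, 1)` has value of about 0.36: the eligibility trace passes part of the terminal reward back to the first step within the same episode, which is the property that distinguishes Sarsa(λ) from one-step Sarsa.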

## 334 Citations

Learning a Self-driving Bicycle Using Deep Deterministic Policy Gradient

- Computer Science
- 2018 18th International Conference on Control, Automation and Systems (ICCAS)
- 2018

This paper improves the method for learning to control a bicycle that can balance itself and go to any specified location, proposing a procedure that lets the controller be learned gradually until it can stably balance the bicycle and lead it to any specified place.

Controlling bicycle using deep deterministic policy gradient algorithm

- Engineering, Computer Science
- 2017 14th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI)
- 2017

This study focuses on applying a state-of-the-art deep reinforcement learning algorithm called Deep Deterministic Policy Gradient to control the bicycle.

Toward Self-Driving Bicycles Using State-of-the-Art Deep Reinforcement Learning Algorithms

- Mathematics, Computer Science
- Symmetry
- 2019

This paper uses a reward function and a deep neural network to build a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm.

Reinforcement learning for bicycle control

- Computer Science
- 2013

To extend Randløv and Alstrøm's work on shaping, this work reimplemented their original approach using the PyBrain machine learning library and tried its own suite of complex reward functions intended to work well for arbitrary goal destinations.

Reinforcement Learning Model with a Reward Function Based on Human Driving Characteristics

- Computer Science
- 2019 15th International Conference on Computational Intelligence and Security (CIS)
- 2019

A comparison of the proposed RL model with human drivers shows that the trained agent can follow the preceding vehicle smoothly and safely.

Learning bicycle stunts

- Computer Science
- ACM Trans. Graph.
- 2014

This work presents a general approach for simulating and controlling a human character that is riding a bicycle and uses NeuroEvolution of Augmenting Topologies (NEAT) to optimize both the parametrization and the parameters of the policies.

Reinforcement-Driven Shaping of Sequence Learning in Neural Dynamics

- Computer Science
- SAB
- 2014

A recent framework for integrating reinforcement learning and dynamic neural fields is extended, by using the principle of shaping, in order to reduce the search space of the learning agent.

Learning Macro-Actions in Reinforcement Learning

- Computer Science
- NIPS
- 1998

A method for automatically constructing macro-actions from primitive actions during the reinforcement learning process: the tendency to perform action b after action a is reinforced if such a pattern of actions has been rewarded.

The Challenges of Reinforcement Learning in Robotics and Optimal Control

- Computer Science
- AISI
- 2016

This paper discusses a widely used RL algorithm called Q-learning, which can be adapted to work in continuous state and action spaces; methods for computing rewards that generate an adaptive optimal controller and accelerate the learning process; and, finally, safe exploration approaches.

A phased reinforcement learning algorithm for complex control problems

- Computer Science
- Artificial Life and Robotics
- 2007

The key element of the proposed algorithm is a shaping function defined on a novel position–direction space that is autonomously constructed once the goal is reached, and constrains the exploration area to optimize the policy.

## References

SHOWING 1-10 OF 37 REFERENCES

Reward Functions for Accelerated Learning

- Computer Science
- ICML
- 1994

A methodology for designing reinforcement functions that take advantage of implicit domain knowledge in order to accelerate learning in situated domains characterized by multiple goals, noisy state, and inconsistent reinforcement is proposed.

Training and Tracking in Robotics

- Computer Science
- IJCAI
- 1985

The learning system's ability to adapt to changes and to profit from a selected training sequence are explored, both of which are of obvious utility in practical robotics applications.

Reinforcement learning and its application to control

- Computer Science
- 1992

It is argued that for certain types of problems the latter approach, of which reinforcement learning is an example, can yield faster, more reliable learning, while the former approach is relatively inefficient.

Robot shaping: The Hamster Experiment

- Engineering
- 1996

In this paper we present an example of the application of a technique, which we call robot shaping, to designing and building learning autonomous robots. Our autonomous robot (called HAMSTER1) is a…

Robot Shaping: Developing Autonomous Agents Through Learning

- Computer Science
- Artif. Intell.
- 1994

This paper connects both simulated and real robots to Alecsys, a parallel implementation of a learning classifier system with an extended genetic algorithm to demonstrate that classifier systems with genetic algorithms can be practically employed to develop autonomous agents.

Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces

- Computer Science
- Adapt. Behav.
- 1997

This article proposes a simple and modular technique that can be used to implement function approximators with nonuniform degrees of resolution so that the value function can be represented with higher accuracy in important regions of the state and action spaces.

Introduction to Reinforcement Learning

- Computer Science
- 1998

In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.

Roles of Macro-Actions in Accelerating Reinforcement Learning

- Computer Science
- 1997

Although eligibility traces increased the rate of convergence to the optimal value function compared to learning with macro-actions but without eligibility traces, eligibility traces did not permit the optimal policy to be learned as quickly as it was using macro-actions.

Problem solving with reinforcement learning

- Computer Science
- 1995

This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high dimensional continuous state-space environments. In…

Temporal Difference Learning and TD-Gammon

- Computer Science
- J. Int. Comput. Games Assoc.
- 1995

TD-GAMMON is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.