Behavior coordination for a mobile robot using modular reinforcement learning

@inproceedings{uchibe1996behavior,
  title={Behavior coordination for a mobile robot using modular reinforcement learning},
  author={Eiji Uchibe and Minoru Asada and Koh Hosoda},
  booktitle={Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '96)},
  year={1996},
  volume={3},
  pages={1329--1336}
}
Coordinating multiple behaviors, each obtained independently by reinforcement learning, is a key issue in scaling the method to larger and more complex robot learning tasks. Directly combining the state spaces of all individual modules (subtasks) requires enormous learning time and introduces hidden states. This paper presents a modular learning method that coordinates multiple behaviors while accounting for the trade-off between learning time and performance. First…
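The coordination problem described in the abstract can be illustrated with a minimal "greatest mass" sketch: each module learns a tabular Q-function over its own small subtask state space, and a coordinator picks the action with the largest summed value across modules. This is an illustrative simplification, not the paper's exact scheme; all names below are hypothetical.

```python
class QModule:
    """One behavior module: tabular Q-learning on its own subtask state space."""
    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.q = {}                  # (state, action) -> estimated value
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma

    def value(self, s, a):
        return self.q.get((s, a), 0.0)

    def update(self, s, a, r, s2):
        """Standard one-step Q-learning update within this module's subtask."""
        best_next = max(self.value(s2, a2) for a2 in self.actions)
        td = r + self.gamma * best_next - self.value(s, a)
        self.q[(s, a)] = self.value(s, a) + self.alpha * td

def coordinate(modules, states, actions):
    """Greatest-mass coordination: each module evaluates the shared action
    in its own state; the action with the largest summed Q-value wins."""
    return max(actions,
               key=lambda a: sum(m.value(s, a) for m, s in zip(modules, states)))
```

Because each module sees only its own small state space, learning time stays per-subtask; the cost is that subtask interactions become hidden state, which is the trade-off the paper addresses.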
Vision Based State Space Construction for Learning Mobile Robots in Multi-agent Environments
A method is proposed that estimates the relationship between the learner's behaviors and those of the other agents in the environment through interactions, using system identification to construct a state space in such an environment.
Cooperative behavior acquisition by learning and evolution in a multi-agent environment for mobile robots
This dissertation proposes a method that acquires purposive behaviors based on estimated state vectors, together with a learning schedule that stabilizes learning, especially in the early stages of multi-agent systems.
Module-Based Reinforcement Learning: Experiments with a Real Robot
It is argued that the approach enables fast, semi-automatic, but still high-quality robot control, since no fine-tuning of the local controllers is needed, and supports the view that adaptive algorithms are advantageous over non-adaptive ones in complex environments.
Cooperative Behavior Acquisition for Mobile Robots in Dynamically Changing Real Worlds Via Vision-Based Reinforcement Learning and Development
A vision-based reinforcement learning method that acquires cooperative behaviors in a dynamic environment is presented, using the robot soccer game initiated by RoboCup to illustrate its effectiveness.
Module Based Reinforcement Learning: An Application to a Real Robot
A systematic design method is suggested, motivated by the desire to transform the task to be solved into a finite-state, discrete-time, "approximately" Markovian task that is also completely observable.
Module Based Reinforcement Learning for a Real Robot
The behaviour of reinforcement learning (RL) algorithms is best understood in completely observable, finite state- and action-space, discrete-time controlled Markov chains. Robot-learning domains, on…
Vision-based Behavior Learning and Development for Emergence of Robot Intelligence
This paper focuses on two issues in learning and development: the problem of state-action space construction, and the scaling-up problem. The former is mainly related to sensory-motor mapping and its…
Cooperative Behavior Acquisition by Learning and Evolution of Vision-Motor Mapping for Mobile Robots
This paper proposes a number of learning and evolutionary methods that contribute to realizing cooperative behaviors among vision-based mobile robots in a dynamically changing environment. There are three…
Strategy Classification in Multi-agent Environment — Applying Reinforcement Learning to Soccer Agents —
A method for agent behavior classification is presented that estimates the relations between the learner's behaviors and those of the other agents in the environment through interactions, using system identification; it can also cope with a rolling ball.
Vision-based Learning and Development for Emergence of Robot Behaviors
A method is introduced that copes with the complexity of a multi-agent environment through a combination of a state-vector estimation process and a reinforcement learning process based on the estimated vectors.


Coordination of multiple behaviors acquired by a vision-based reinforcement learning
A method is proposed that accomplishes a whole task consisting of several subtasks by coordinating multiple behaviors acquired through vision-based reinforcement learning; three kinds of behavior coordination are considered.
Vision-based reinforcement learning for purposive behavior acquisition
A method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal is presented, and several issues in applying the reinforcement learning method to a real robot with a vision sensor are discussed.
Learning emergent tasks for an autonomous mobile robot
We present an implementation of a reinforcement learning algorithm through the use of a special neural network topology, the AHC (adaptive heuristic critic). The AHC is used as a fusion supervisor of…
Learning Multiple Goal Behavior via Task Decomposition and Dynamic Policy Merging
This chapter considers the application of reinforcement learning to a simple class of dynamic multi-goal tasks and examines several merging strategies, from simple ones that compare and combine modular information about the current state only, to more sophisticated strategies that use lookahead search to construct more accurate utility estimates.
Complexity and Cooperation in Q-Learning
This chapter describes two cooperative learning algorithms that can reduce search and decouple the learning rate from state-space size, using the idea of a mentor who watches the learner and generates immediate rewards in response to its most recent actions.
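The mentor idea can be sketched as reward shaping on a toy corridor task: a mentor function supplies immediate feedback on each action, decoupling learning from the sparse goal reward. The task, mentor, and step sizes below are illustrative assumptions, not the chapter's experiments.

```python
import random

def q_learn(n_states=6, episodes=200, mentor=None, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a 1-D corridor with the goal at the right end.
    If `mentor` is given, its immediate reward is added to the sparse
    environment reward at every step."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]    # actions: 0 = left, 1 = right
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            # epsilon-greedy action selection
            if rng.random() < 0.2:
                a = rng.choice([0, 1])
            else:
                a = max((0, 1), key=lambda x: q[s][x])
            s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            if mentor:
                r += mentor(s, a)                # mentor's immediate feedback
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

# An illustrative mentor that rewards progress toward the goal:
q = q_learn(mentor=lambda s, a: 0.1 if a == 1 else -0.1)
```

The mentor's dense rewards break early ties immediately, so the learner needs far less exploration than with the goal reward alone.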
Rapid Task Learning for Real Robots
This chapter discusses how learning can be sped up by exploiting properties of the task, sensor configuration, environment, and existing control structure.
A Reinforcement Learning Method for Maximizing Undiscounted Rewards
A metric of undiscounted performance and an algorithm for finding action policies that maximize that measure are presented; the algorithm is modelled after the popular Q-learning algorithm.
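The undiscounted update can be sketched in R-learning style: values are measured relative to an estimated average reward rho instead of being discounted. A minimal sketch with illustrative step sizes, not the paper's exact formulation:

```python
def r_learning_update(R, rho, s, a, r, s2, beta=0.1, alpha=0.01, actions=(0, 1)):
    """One R-learning-style step: relative values R(s, a) and the average
    reward rho are updated together; rho replaces the discount factor."""
    best2 = max(R.get((s2, a2), 0.0) for a2 in actions)
    best1 = max(R.get((s, a2), 0.0) for a2 in actions)
    old = R.get((s, a), 0.0)
    R[(s, a)] = old + beta * (r - rho + best2 - old)
    if old == best1:                 # update rho only on greedy actions
        rho += alpha * (r - rho + best2 - best1)
    return rho
```

On a one-state, one-action task with constant reward, rho converges to that reward, which is the average-reward interpretation the metric is built on.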
Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State
Utile Suffix Memory is a reinforcement learning algorithm that uses short-term memory to overcome the state aliasing that results from hidden state, combining the advantages of previous work in instance-based and memory-based learning.
Learning from delayed rewards
  • B. Kröse
  • Computer Science
  • Robotics Auton. Syst.
  • 1995
A new look at the statistical model identification
The history of the development of statistical hypothesis testing in time series analysis is reviewed briefly, and it is pointed out that the hypothesis testing procedure is not adequately defined as…
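The criterion this paper introduces (AIC) trades goodness of fit against model complexity. For an n-point least-squares fit with Gaussian errors it can be sketched as follows; the numbers in the example are illustrative, not from the paper.

```python
import math

def aic(rss, n, k):
    """Akaike information criterion for an n-point least-squares fit with
    Gaussian errors and k free parameters (additive constants dropped):
    AIC = n * log(RSS / n) + 2k.  Lower is better."""
    return n * math.log(rss / n) + 2 * k

# Choosing between two hypothetical fits of the same 50 points:
aic_simple = aic(rss=12.0, n=50, k=2)    # worse fit, fewer parameters
aic_flexible = aic(rss=11.5, n=50, k=6)  # slightly better fit, more parameters
# Here the simpler model scores lower: the small drop in RSS does not
# offset the complexity penalty of 2 per extra parameter.
```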