Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach

Lonnie Chrisman
School of Computer Science, Carnegie Mellon University
Pittsburgh, PA 15213
chrisman@cs.cmu.edu

Abstract

It is known that perceptual aliasing may significantly diminish the effectiveness of reinforcement learning algorithms [Whitehead and Ballard, 1991]. Perceptual aliasing occurs when multiple situations that are indistinguishable from immediate perceptual input require different responses from the system. For example, if a robot can only see forward, yet the presence of a battery charger behind it determines whether or not it should back up, immediate perception alone is insufficient for determining the most appropriate action. This is problematic because reinforcement learning algorithms typically learn a control policy mapping immediate perceptual input to the optimal choice of action. This paper introduces the predictive distinctions approach to compensate for perceptual aliasing caused by incomplete perception of the world. An additional component, a predictive model, is used to track aspects of the world that may not be visible at all times. In addition to the control policy, the model itself must be learned; to allow for stochastic actions and noisy perception, a probabilistic model is learned from experience. In the process, the system must discover, on its own, the important distinctions in the world. Experimental results are given for a simple simulated domain, and additional issues are discussed.

Introduction

Reinforcement learning techniques have recently received considerable interest due to their potential application to the problem of learning situated behaviors for robotic tasks ([Sutton, 1990], [Lin, 1991], [Mahadevan and Connell, 1991], [Millán and Torras, 1991], [Chapman and Kaelbling, 1991]). The objective for a reinforcement learning agent is to acquire a policy for choosing actions so as to maximize overall performance. After each action, the environment provides feedback in the form of a scalar reinforcement value, and the discounted cumulative reinforcement is customarily used to assess overall performance.

[Figure: system architecture diagram showing the Predictive Model and the Reinforcement Learner as separate components]
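For reference, the discounted cumulative reinforcement mentioned above is conventionally written as follows, where r_t is the scalar reinforcement received at step t and gamma is the discount factor; this is standard notation, not drawn verbatim from the paper's text:

    R = \sum_{t=0}^{\infty} \gamma^{t} r_{t}, \qquad 0 \le \gamma < 1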
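To make concrete what it means for a predictive model to track aspects of the world that are not currently visible, the following is a minimal Python sketch of a belief-state (Bayes filter) update over discrete hidden states. It illustrates the general technique only, not Chrisman's learning algorithm; the function name, state names, and the transition table T and observation table O are hypothetical assumptions introduced for illustration.

    # Minimal illustration (not the paper's algorithm): carrying a probability
    # distribution over hidden states forward through a stochastic action and a
    # noisy observation. All names and probabilities below are made up.

    def belief_update(belief, action, observation, T, O):
        """One step of belief tracking.

        belief:       dict state -> probability of being in that state now
        T[a][s][s2]:  probability of reaching s2 after taking action a in state s
        O[s2][o]:     probability of observing o in state s2
        """
        # Predict: push the belief through the stochastic action model.
        successors = {s2 for s in belief for s2 in T[action][s]}
        predicted = {s2: sum(belief[s] * T[action][s].get(s2, 0.0) for s in belief)
                     for s2 in successors}
        # Correct: weight each state by how well it explains the observation.
        unnormalized = {s2: p * O[s2].get(observation, 0.0)
                        for s2, p in predicted.items()}
        total = sum(unnormalized.values()) or 1.0
        return {s2: p / total for s2, p in unnormalized.items()}

    # Toy use: a forward-facing robot cannot see whether a charger is behind it,
    # but its belief over that hidden fact is carried forward as it acts and observes.
    T = {"turn": {"charger_behind": {"charger_ahead": 1.0},
                  "no_charger_behind": {"nothing_ahead": 1.0}}}
    O = {"charger_ahead": {"see_charger": 0.9, "see_nothing": 0.1},
         "nothing_ahead": {"see_charger": 0.1, "see_nothing": 0.9}}
    b = {"charger_behind": 0.5, "no_charger_behind": 0.5}
    b = belief_update(b, "turn", "see_charger", T, O)

Maintaining such a belief is what allows a policy to condition on distinctions, like the charger behind the robot, that immediate perception alone cannot make.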

Cite this paper

@inproceedings{Chrisman1992ReinforcementLW,
  title={Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach},
  author={Lonnie Chrisman},
  year={1992}
}