Associative reinforcement learning: A generate and test algorithm

  • Leslie Pack Kaelbling
    Machine Learning
An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing an algorithm that performs an on-line search…

Associative reinforcement learning: Functions in k-DNF

Algorithms are developed that can efficiently learn action maps that are expressible in k-DNF and are shown to have very good performance.

Reinforcement Learning with Immediate Rewards and Linear Hypotheses

For two cases, one in which a continuous-valued reward is obtained by applying the unknown linear function and another in which that function gives the probability of receiving the larger of two binary-valued rewards, lower bounds are provided showing that the rate of convergence is nearly optimal.

Investigating Generate and Test for Online Representation Search with Softmax Outputs

This thesis investigates an online and incremental representation-search algorithm called Generate and Test, which continually replaces the least useful features with newly generated ones, and empirically shows that the proposed tester improves representations more than the magnitude-based tester.

Decision making using Thompson Sampling

This thesis argues that both the multi-armed bandit problem and the best arm identification problem can be tackled effectively using Thompson Sampling based approaches and provides empirical evidence to support this claim.
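As a minimal sketch of the approach that entry describes, the following Beta-Bernoulli Thompson Sampling loop tackles a multi-armed bandit by sampling a plausible success rate for each arm from its posterior and pulling the argmax. The arm probabilities and function name here are illustrative assumptions, not taken from the thesis:

```python
import random

def thompson_sampling(arm_probs, pulls=2000, seed=0):
    """Beta-Bernoulli Thompson Sampling for a multi-armed bandit.

    arm_probs holds the true success probabilities (unknown to the agent).
    Returns how many times each arm was chosen.
    """
    rng = random.Random(seed)
    alpha = [1] * len(arm_probs)  # Beta(1, 1) uniform prior: 1 + successes
    beta = [1] * len(arm_probs)   # 1 + failures
    counts = [0] * len(arm_probs)
    for _ in range(pulls):
        # Draw one sample per arm from its posterior, act greedily on the samples
        samples = [rng.betavariate(alpha[k], beta[k]) for k in range(len(arm_probs))]
        i = max(range(len(arm_probs)), key=lambda k: samples[k])
        reward = 1 if rng.random() < arm_probs[i] else 0
        alpha[i] += reward
        beta[i] += 1 - reward
        counts[i] += 1
    return counts

counts = thompson_sampling([0.2, 0.5, 0.8])
```

Because arms with uncertain posteriors occasionally produce high samples, the loop explores automatically and concentrates its pulls on the best arm over time.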

A version of Geiringer-like theorem for decision making in the environments with randomness and incomplete information

A version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory is established; it leads to novel Monte Carlo sampling algorithms that provably increase the AI potential.

A confidence metric for using neurobiological feedback in actor-critic reinforcement learning based brain-machine interfaces

An adaptive BMI that can handle inaccuracies in the critic feedback is developed, in an effort to produce more accurate RL-based BMIs, and the technique's potential application in an autonomous BMI that needs no external training signal or extensive calibration is suggested.

Using Confidence Bounds for Exploitation-Exploration Trade-offs

  • P. Auer
  • Computer Science
    J. Mach. Learn. Res.
  • 2002
It is shown how a standard tool from statistics, namely confidence bounds, can be used to deal elegantly with situations that exhibit an exploitation-exploration trade-off, improving the regret from O(T^(3/4)) to O(T^(1/2)).
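The confidence-bound idea can be sketched with the standard UCB1 index rule, which pulls the arm maximizing the empirical mean plus an exploration bonus; this is a common textbook form under assumed Bernoulli rewards, not necessarily the exact algorithm analyzed in the paper:

```python
import math
import random

def ucb1(arm_probs, pulls=2000, seed=0):
    """UCB1: pull the arm maximizing mean + sqrt(2 ln t / n_k).

    The bonus term is an upper confidence bound on the arm's mean, so
    rarely tried arms stay attractive until their estimates tighten.
    """
    rng = random.Random(seed)
    n = [0] * len(arm_probs)            # pulls per arm
    reward_sum = [0.0] * len(arm_probs)  # total reward per arm
    for t in range(1, pulls + 1):
        if t <= len(arm_probs):
            i = t - 1                    # initialization: try each arm once
        else:
            i = max(range(len(arm_probs)),
                    key=lambda k: reward_sum[k] / n[k]
                    + math.sqrt(2 * math.log(t) / n[k]))
        reward = 1.0 if rng.random() < arm_probs[i] else 0.0
        n[i] += 1
        reward_sum[i] += reward
    return n

counts = ucb1([0.2, 0.5, 0.8])
```

The bonus shrinks as an arm accumulates pulls, so suboptimal arms are sampled only logarithmically often while the best arm receives the bulk of the pulls.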

Computational mechanisms of curiosity and goal-directed exploration

This work illustrates the emergence of different types of information-gain, termed active inference and active learning, and shows how these forms of exploration induce distinct patterns of ‘Bayes-optimal’ behaviour.

Conceptual Commitments of the LIDA Model of Cognition

The intention is to initiate a discussion among AGI researchers about which conceptual commitments are essential, or particularly useful, toward creating AGI agents, and to describe the hypotheses underlying one such model, the Learning Intelligent Distribution Agent (LIDA) Model.



Associative Reinforcement Learning: Functions in k-DNF

Algorithms that can efficiently learn action maps that are expressible in k-DNF are developed and are shown to have very good performance.

Learning in embedded systems

This dissertation addresses the problem of designing algorithms for learning in embedded systems; its learning algorithms draw on Sutton's techniques for linear association and reinforcement comparison, while the interval estimation algorithm uses the statistical notion of confidence intervals to guide its generation of actions.
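The confidence-interval idea can be illustrated roughly as follows: estimate each action's success probability, compute an upper confidence limit, and always take the action with the highest limit. This is a simplified sketch using a normal-approximation interval with Laplace smoothing, not Kaelbling's exact interval estimation algorithm:

```python
import math
import random

def interval_estimation(arm_probs, pulls=2000, z=1.96, seed=0):
    """Pick the action whose estimated success probability has the
    highest upper confidence limit (normal approximation)."""
    rng = random.Random(seed)
    n = [0] * len(arm_probs)     # times each action was tried
    wins = [0] * len(arm_probs)  # successes observed per action

    def upper_limit(k):
        if n[k] == 0:
            return float("inf")              # untried actions look maximally promising
        p = (wins[k] + 1) / (n[k] + 2)        # Laplace-smoothed estimate
        return p + z * math.sqrt(p * (1 - p) / n[k])

    for _ in range(pulls):
        i = max(range(len(arm_probs)), key=upper_limit)
        reward = 1 if rng.random() < arm_probs[i] else 0
        n[i] += 1
        wins[i] += reward
    return n

counts = interval_estimation([0.2, 0.5, 0.8])
```

Acting on the upper limit rather than the point estimate means an action is abandoned only once enough evidence has narrowed its interval below the current best, which is the exploitation-exploration mechanism the dissertation exploits.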

Predicting the effect of instance representations on inductive learning

This dissertation describes how the duality between finding a compact description for the examples and generalizing from the examples can be utilized to determine the suitability of an instance representation for a learning algorithm.

Incremental learning from noisy data

This paper first reviews a framework for discussing machine learning systems and then describes STAGGER in that framework; STAGGER is based on a distributed concept description composed of a set of weighted, symbolic characterizations.

Concept acquisition through representational adjustment

This thesis promotes the hypothesis that the necessary abstractions can be learned, and presents a model that relies on a weighted, symbolic description of concepts, should scale up to larger tasks than those studied, and has a number of potential applications.

Subjective Bayesian methods for rule-based inference systems

A subjective Bayesian inference method that realizes some of the advantages of both formal and informal approaches is described, along with the modifications needed to deal with the inconsistencies usually found in collections of subjective statements.

Learning in Embedded Systems. Cambridge, Massachusetts: The MIT Press

  • Also available as a PhD thesis from Stanford University
  • 1993

Associative reinforcement learning

  • 1993