# Associative reinforcement learning: A generate and test algorithm

@article{Kaelbling2004AssociativeRL, title={Associative reinforcement learning: A generate and test algorithm}, author={Leslie Pack Kaelbling}, journal={Machine Learning}, year={2004}, volume={15}, pages={299-319} }

An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing an algorithm that performs an on-line search…

## 22 Citations

### Associative reinforcement learning: Functions in k-DNF

- Computer Science
- Machine Learning
- 2004

Algorithms are developed that can efficiently learn action maps that are expressible in k-DNF and are shown to have very good performance.

### Associative Reinforcement Learning: Functions in k-DNF

- Computer Science
- Machine Learning
- 2004

Algorithms that can efficiently learn action maps that are expressible in k-DNF are developed and are shown to have very good performance.

### Reinforcement Learning with Immediate Rewards and Linear Hypotheses

- Computer Science
- Algorithmica
- 2003

For two cases, one in which a continuous-valued reward is given by applying the unknown linear function, and another in which the probability of receiving the larger of binary-valued rewards is obtained, lower bounds are provided that show that the rate of convergence is nearly optimal.

### Decision making using Thompson Sampling

- Computer Science, Economics
- 2014

This thesis argues that both the multi-armed bandit problem and the best arm identification problem can be tackled effectively using Thompson Sampling based approaches and provides empirical evidence to support this claim.

### A version of Geiringer-like theorem for decision making in the environments with randomness and incomplete information

- Mathematics
- Int. J. Intell. Comput. Cybern.
- 2012

A version of a theorem that originated in population genetics and was later adopted in evolutionary computation theory is established; it is expected to lead to novel Monte-Carlo sampling algorithms that provably increase the AI potential.

### A confidence metric for using neurobiological feedback in actor-critic reinforcement learning based brain-machine interfaces

- Computer Science
- Front. Neurosci.
- 2014

An adaptive BMI that could handle inaccuracies in the critic feedback in an effort to produce more accurate RL based BMIs is developed and the potential application of the technique in developing an autonomous BMI that does not need an external signal for training or extensive calibration is suggested.

### Using Confidence Bounds for Exploitation-Exploration Trade-offs

- Computer Science
- J. Mach. Learn. Res.
- 2002

It is shown how a standard tool from statistics, namely confidence bounds, can be used to deal elegantly with situations that exhibit an exploitation-exploration trade-off, improving the regret from $O(T^{3/4})$ to $O(T^{1/2})$.

### Computational mechanisms of curiosity and goal-directed exploration

- Psychology
- bioRxiv
- 2018

This work illustrates the emergence of different types of information-gain, termed active inference and active learning, and shows how these forms of exploration induce distinct patterns of ‘Bayes-optimal’ behaviour.

### Conceptual Commitments of the LIDA Model of Cognition

- Biology
- J. Artif. Gen. Intell.
- 2013

The intention is to initiate a discussion among AGI researchers about which conceptual commitments are essential, or particularly useful, toward creating AGI agents, and to describe the hypotheses underlying one such model, the Learning Intelligent Distribution Agent (LIDA) Model.

### A novel heuristic Q-learning algorithm for solving stochastic games

- Computer Science
- 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
- 2008

The experiments show that the proposed Multi-agent Heuristic Q-Learning (MHQL) method can drastically reduce inefficient and repetitive learning and thus converges faster than iterative Q-learning.

## References

### Associative Reinforcement Learning: Functions in k-DNF

- Computer Science
- Machine Learning
- 2004

Algorithms that can efficiently learn action maps that are expressible in k-DNF are developed and are shown to have very good performance.

### Learning in embedded systems

- Computer Science
- 1993

This dissertation addresses the problem of designing algorithms for learning in embedded systems using Sutton's techniques for linear association and reinforcement comparison, while the interval estimation algorithm uses the statistical notion of confidence intervals to guide its generation of actions.

### Predicting the effect of instance representations on inductive learning

- Computer Science
- 1992

This dissertation describes how the duality between finding a compact description for the examples and generalizing from the examples can be utilized to determine the suitability of an instance representation for a learning algorithm.

### Incremental learning from noisy data

- Computer Science
- Machine Learning
- 2004

This paper first reviews a framework for discussing machine learning systems and then describes STAGGER in that framework. STAGGER is based on a distributed concept description composed of a set of weighted, symbolic characterizations.

### Concept acquisition through representational adjustment

- Computer Science
- 1987

This thesis promotes the hypothesis that the necessary abstractions can be learned and presents a model that relies on a weighted, symbolic description of concepts. The model should scale up to larger tasks than those studied and has a number of potential applications.

### Subjective bayesian methods for rule-based inference systems

- Computer Science
- AFIPS '76
- 1976

A subjective Bayesian inference method that realizes some of the advantages of both formal and informal approaches, and modifications needed to deal with the inconsistencies usually found in collections of subjective statements are described.

### Learning in Embedded Systems. Cambridge, Massachusetts: The MIT Press

- Also available as a PhD Thesis from Stanford University,
- 1993

### Associative reinforcement learn

- 1993