Associative reinforcement learning: A generate and test algorithm

@article{Kaelbling1994AssociativeRL,
  title={Associative reinforcement learning: A generate and test algorithm},
  author={Leslie Pack Kaelbling},
  journal={Machine Learning},
  year={1994},
  volume={15},
  pages={299--319}
}
An agent that must learn to act in the world by trial and error faces the reinforcement learning problem, which is quite different from standard concept learning. Although good algorithms exist for this problem in the general case, they are often quite inefficient and do not exhibit generalization. One strategy is to find restricted classes of action policies that can be learned more efficiently. This paper pursues that strategy by developing an algorithm that performs an on-line search… 

Associative Reinforcement Learning: Functions in k-DNF

TLDR
Algorithms that can efficiently learn action maps that are expressible in k-DNF are developed and are shown to have very good performance.
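For context, a k-DNF formula is a disjunction of conjunctions of at most k literals each; a hypothetical 2-DNF action rule over three Boolean features might look like:

```python
def act(x1, x2, x3):
    # Hypothetical 2-DNF action rule: (x1 AND NOT x2) OR (x2 AND x3).
    # Each conjunct contains at most k = 2 literals.
    return bool((x1 and not x2) or (x2 and x3))
```

A policy class restricted to k-DNF rules like this one is small enough to be searched and learned far more efficiently than arbitrary action maps.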

Reinforcement Learning with Immediate Rewards and Linear Hypotheses

TLDR
For two cases, one in which a continuous-valued reward is given by applying the unknown linear function, and another in which that function determines the probability of receiving the larger of two binary-valued rewards, lower bounds are provided that show that the rate of convergence is nearly optimal.

Decision making using Thompson Sampling

TLDR
This thesis argues that both the multi-armed bandit problem and the best arm identification problem can be tackled effectively using Thompson Sampling based approaches and provides empirical evidence to support this claim.
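The Thompson Sampling approach mentioned above is commonly sketched for Bernoulli bandits as follows (a generic illustration, not the thesis's specific variants): keep a Beta posterior per arm, sample from each posterior, and pull the arm whose sample is largest.

```python
import random

random.seed(1)

def thompson_bandit(true_probs, horizon=2000):
    """Thompson Sampling for Bernoulli arms: sample each arm's Beta posterior
    and pull the arm with the largest sampled success probability."""
    k = len(true_probs)
    successes = [1] * k  # Beta(1, 1) uniform priors
    failures = [1] * k
    pulls = [0] * k
    for _ in range(horizon):
        samples = [random.betavariate(successes[i], failures[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        r = 1 if random.random() < true_probs[arm] else 0
        successes[arm] += r
        failures[arm] += 1 - r
        pulls[arm] += 1
    return pulls

# The best arm (0.8) should accumulate the large majority of the pulls.
pulls = thompson_bandit([0.3, 0.5, 0.8])
```

Because posterior sampling is naturally self-annealing, exploration concentrates on the best arm without any explicit exploration schedule.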

A version of Geiringer-like theorem for decision making in the environments with randomness and incomplete information

TLDR
A version of a theorem that originated from population genetics and has been later adopted in evolutionary computation theory that will lead to novel Monte‐Carlo sampling algorithms that provably increase the AI potential is established.

A confidence metric for using neurobiological feedback in actor-critic reinforcement learning based brain-machine interfaces

TLDR
An adaptive BMI that could handle inaccuracies in the critic feedback in an effort to produce more accurate RL based BMIs is developed and the potential application of the technique in developing an autonomous BMI that does not need an external signal for training or extensive calibration is suggested.

Using Confidence Bounds for Exploitation-Exploration Trade-offs

  • P. Auer
  • Computer Science
    J. Mach. Learn. Res.
  • 2002
TLDR
It is shown how a standard tool from statistics, namely confidence bounds, can be used to elegantly deal with situations which exhibit an exploitation-exploration trade-off, and improves the regret from O(T^(3/4)) to O(T^(1/2)).
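The confidence-bound idea is usually illustrated with the UCB1 rule (a textbook sketch under standard assumptions, not a reproduction of Auer's full analysis): pull the arm maximizing its empirical mean plus a bonus of sqrt(2 ln t / n_i), so under-explored arms keep wide confidence intervals and get revisited.

```python
import math
import random

random.seed(2)

def ucb1(true_probs, horizon=2000):
    """UCB1 for Bernoulli arms: exploit the empirical mean, explore via a
    confidence-bound bonus that shrinks as an arm is pulled more often."""
    k = len(true_probs)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialization: pull each arm once
        else:
            arm = max(range(k),
                      key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1 if random.random() < true_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
    return counts

# The best arm (0.9) should dominate the pull counts.
counts = ucb1([0.2, 0.5, 0.9])
```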

Computational mechanisms of curiosity and goal-directed exploration

TLDR
This work illustrates the emergence of different types of information-gain, termed active inference and active learning, and shows how these forms of exploration induce distinct patterns of ‘Bayes-optimal’ behaviour.

Conceptual Commitments of the LIDA Model of Cognition

TLDR
The intention is to initiate a discussion among AGI researchers about which conceptual commitments are essential, or particularly useful, toward creating AGI agents, and to describe the hypotheses underlying one such model, the Learning Intelligent Distribution Agent (LIDA) Model.

A novel heuristic Q-learning algorithm for solving stochastic games

  • Jianwei Li, Weiyi Liu
  • Computer Science
    2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence)
  • 2008
TLDR
The experimentation shows that the proposed Multi-agent Heuristic Q-Learning (MHQL) method can drastically decrease inefficient and repetitive learning, thus speeding up convergence compared with iterative Q-learning.
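For reference, the single-agent tabular Q-learning baseline that heuristic variants like MHQL aim to accelerate uses the update Q(s,a) += α(r + γ max_a' Q(s',a') − Q(s,a)); a minimal sketch on a hypothetical two-state chain MDP (not the stochastic-game setting of the paper):

```python
import random

random.seed(4)

# Hypothetical MDP: action 1 always moves to state 1, which pays reward 1;
# action 0 moves to state 0, which pays nothing.
n_states, n_actions = 2, 2
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    s2 = a
    r = 1.0 if s2 == 1 else 0.0
    return s2, r

s = 0
for _ in range(2000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        a = random.randrange(n_actions)
    else:
        a = max(range(n_actions), key=lambda x: Q[s][x])
    s2, r = step(s, a)
    # one-step temporal-difference update toward the bootstrapped target
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2
```

After training, action 1 should carry the higher Q-value in both states, reflecting the optimal policy of always moving toward the rewarding state.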

References


Associative Reinforcement Learning: Functions in k-DNF

TLDR
Algorithms that can efficiently learn action maps that are expressible in k-DNF are developed and are shown to have very good performance.

Learning in embedded systems

TLDR
This dissertation addresses the problem of designing algorithms for learning in embedded systems using Sutton's techniques for linear association and reinforcement comparison, while the interval estimation algorithm uses the statistical notion of confidence intervals to guide its generation of actions.
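The interval-estimation idea can be sketched as follows (an illustrative version using a conservative normal confidence interval for Bernoulli rewards, not necessarily the dissertation's exact formulation): act greedily with respect to the upper bound of each action's confidence interval, so poorly sampled actions look optimistic and get tried.

```python
import math
import random

random.seed(3)

def interval_estimation(true_probs, horizon=2000, z=1.96):
    """Pick the action whose upper confidence bound on mean reward is largest.
    Uses the conservative Bernoulli bound z * sqrt(0.25 / n)."""
    k = len(true_probs)
    counts = [0] * k
    means = [0.0] * k

    def upper(i):
        if counts[i] == 0:
            return float('inf')  # untried actions are maximally optimistic
        return means[i] + z * math.sqrt(0.25 / counts[i])

    for _ in range(horizon):
        a = max(range(k), key=upper)
        r = 1 if random.random() < true_probs[a] else 0
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]
    return counts

# The best action (0.9) should be chosen most often once its interval separates.
counts = interval_estimation([0.2, 0.5, 0.9])
```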

Predicting the effect of instance representations on inductive learning

TLDR
This dissertation describes how the duality between finding a compact description for the examples and generalizing from the examples can be utilized to determine the suitability of an instance representation for a learning algorithm.

Incremental learning from noisy data

TLDR
This paper first reviews a framework for discussing machine learning systems and then describes STAGGER in that framework, which is based on a distributed concept description which is composed of a set of weighted, symbolic characterizations.

Concept acquisition through representational adjustment

TLDR
This thesis promotes the hypothesis that the necessary abstractions can be learned and presents a model that relies on a weighted, symbolic description of concepts that should scale-up to larger tasks than those studied and have a number of potential applications.

Subjective Bayesian methods for rule-based inference systems

TLDR
A subjective Bayesian inference method that realizes some of the advantages of both formal and informal approaches, and modifications needed to deal with the inconsistencies usually found in collections of subjective statements are described.

Learning in Embedded Systems. Cambridge, Massachusetts: The MIT Press

  • Also available as a PhD thesis from Stanford University
  • 1993

Associative reinforcement learning

  • 1993