Planning and Acting in Partially Observable Stochastic Domains
Reinforcement Learning: A Survey
Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
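Two of the survey's central issues, the exploration/exploitation trade-off and learning from delayed reinforcement, can be illustrated with a minimal sketch. The epsilon-greedy rule and the tabular update below are standard textbook devices, not code from the survey; all names are illustrative.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a random action (explore);
    otherwise pick the highest-valued action (exploit)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_update(q, state, action, reward, next_q_values, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: move the estimate toward a
    bootstrapped target, so delayed rewards propagate backward."""
    old = q.get((state, action), 0.0)
    target = reward + gamma * max(next_q_values)
    q[(state, action)] = old + alpha * (target - old)
```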
Learning Policies for Partially Observable Environments: Scaling Up
Acting Optimally in Partially Observable Stochastic Domains
The existing algorithms for computing optimal control strategies for partially observable stochastic environments are found to be highly computationally inefficient and a new algorithm is developed that is empirically more efficient.
Exact and Approximate Algorithms for Partially Observable Markov Decision Processes
This work examines sequential decision making in environments where actions have probabilistic outcomes and the system state is only partially observable; it considers a number of approaches for deriving policies that yield sub-optimal control and empirically explores their performance on a range of problems.
Learning in embedded systems
- L. Kaelbling
- Computer Science
- 20 May 1993
This dissertation addresses the problem of designing algorithms for learning in embedded systems; some of its algorithms build on Sutton's techniques for linear association and reinforcement comparison, while its interval estimation algorithm uses the statistical notion of confidence intervals to guide its choice of actions.
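The interval estimation idea, acting on an upper confidence bound rather than a point estimate, can be sketched as follows. This is a simplified stand-in, assuming rewards in [0, 1] and a normal approximation to the interval; the thesis's actual intervals and bookkeeping differ.

```python
import math

def interval_estimation_action(counts, means, z=1.96):
    """Pick the action with the largest upper confidence bound on its
    mean reward; untried actions are chosen first.  Illustrative only:
    a normal-approximation interval stands in for the thesis's
    confidence-interval machinery."""
    best_a, best_ucb = 0, -math.inf
    for a, n in enumerate(counts):
        if n == 0:
            return a  # no data yet: its interval is effectively unbounded
        half_width = z * math.sqrt(means[a] * (1.0 - means[a]) / n)
        if means[a] + half_width > best_ucb:
            best_a, best_ucb = a, means[a] + half_width
    return best_a
```

Because the bound shrinks with the visit count, a rarely tried action can outrank a well-sampled one even with a lower mean, which is exactly the exploration pressure the algorithm relies on.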
Acting under uncertainty: discrete Bayesian models for mobile-robot navigation
- A. Cassandra, L. Kaelbling, J. Kurien
- Business
- Proceedings of IEEE/RSJ International Conference…
- 1 May 1996
The problem of how actions should be chosen is formulated as a partially observable Markov decision process; the paper presents its optimal solution and goes on to explore a variety of heuristic control strategies.
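At the core of any POMDP formulation like this one is the discrete Bayes filter that maintains a belief over states. A minimal sketch, with hypothetical table layouts (`T[a][s][s2]` for transitions, `O[a][s2][o]` for observations) rather than the paper's robot models:

```python
def belief_update(belief, action, observation, T, O):
    """Discrete Bayes filter: posterior over states after taking
    `action` and then seeing `observation`.  T[a][s][s2] is the
    transition probability, O[a][s2][o] the observation probability.
    A minimal sketch; the paper's models and heuristics are richer."""
    new_belief = []
    for s2 in range(len(belief)):
        p = O[action][s2][observation] * sum(
            T[action][s][s2] * belief[s] for s in range(len(belief)))
        new_belief.append(p)
    total = sum(new_belief)
    return [p / total for p in new_belief] if total > 0 else new_belief
```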
Learning to Cooperate via Policy Search
This paper provides a gradient-based distributed policy-search method for cooperative games, compares the notion of local optimum to that of Nash equilibrium, and demonstrates the method's effectiveness experimentally in a small, partially observable simulated soccer domain.
Lifted Probabilistic Inference with Counting Formulas
- Brian Milch, Luke Zettlemoyer, K. Kersting, Michael Haimes, L. Kaelbling
- Computer Science
- AAAI
- 13 July 2008
This paper presents a new lifted inference algorithm, C-FOVE, that not only handles counting formulas in its input, but also creates counting formulas for use in intermediate potentials, and achieves asymptotic speed improvements compared to FOVE.
Learning to Achieve Goals
- L. Kaelbling
- Computer Science
- IJCAI
The DG-learning algorithm is presented, which learns efficiently to achieve dynamically changing goals and exhibits good knowledge transfer between goals; experiments show the superiority of DG-learning over Q-learning in a moderately large, synthetic, non-deterministic domain.
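The transfer-between-goals idea can be sketched by indexing the value table by goal as well as by state and action, so values learned for one goal survive a goal change. This is a simplified sketch, not Kaelbling's algorithm, which also propagates experience across all goals at once; every name below is illustrative.

```python
def dg_style_update(q, state, action, goal, reward, next_state, actions,
                    alpha=0.2, gamma=0.95):
    """One goal-indexed value update in the spirit of DG-learning: the
    table is keyed by (state, action, goal), so what is learned for one
    goal is retained when the active goal changes.  Simplified sketch;
    the published algorithm additionally shares updates across goals."""
    key = (state, action, goal)
    old = q.get(key, 0.0)
    best_next = max(q.get((next_state, a, goal), 0.0) for a in actions)
    q[key] = old + alpha * (reward + gamma * best_next - old)
```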