Publications
Planning and Acting in Partially Observable Stochastic Domains
In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains.
  • 3,654 citations
  • 429 highly influential citations
  • PDF available
Reinforcement Learning: A Survey
This paper surveys the field of reinforcement learning from a computer-science perspective.
  • 6,768 citations
  • 361 highly influential citations
  • PDF available
Learning Policies for Partially Observable Environments: Scaling Up
We show that a combination of two novel approaches performs well on these problems and suggest methods for scaling to even larger and more complicated domains.
  • 700 citations
  • 81 highly influential citations
Exact and approximate algorithms for partially observable Markov decision processes
Automated sequential decision making is crucial in many contexts. In the face of uncertainty, this task becomes even more important, though at the same time, computing optimal decision policies …
  • 427 citations
  • 58 highly influential citations
Acting Optimally in Partially Observable Stochastic Domains
We describe the partially observable Markov decision process (POMDP) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments, given a complete model of the environment.
  • 680 citations
  • 56 highly influential citations
  • PDF available
Learning in embedded systems
A number of algorithms for learning action strategies from reinforcement values are presented and compared empirically with existing reinforcement-learning algorithms.
  • 732 citations
  • 49 highly influential citations
Acting under uncertainty: discrete Bayesian models for mobile-robot navigation
This paper presents the optimal solution to the problem, formulated as a partially observable Markov decision process.
  • 565 citations
  • 35 highly influential citations
Lifted Probabilistic Inference with Counting Formulas
We present a new lifted inference algorithm, C-FOVE, that not only handles counting formulas in its input, but also creates counting formulas for use in intermediate potentials.
  • 209 citations
  • 29 highly influential citations
  • PDF available
Learning to Cooperate via Policy Search
We provide a gradient-based distributed policy-search method for cooperative multi-agent domains and compare the notion of local optimum to that of Nash equilibrium.
  • 295 citations
  • 27 highly influential citations
  • PDF available
Learning Symbolic Models of Stochastic Domains
We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned.
  • 197 citations
  • 27 highly influential citations
  • PDF available