Corpus ID: 12922309

Reinforcement Learning in POMDP's via Direct Gradient Ascent

@inproceedings{Baxter2000ReinforcementLI,
  title={Reinforcement Learning in POMDP's via Direct Gradient Ascent},
  author={Jonathan Baxter and P. Bartlett},
  booktitle={ICML},
  year={2000}
}
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a REINFORCE-like algorithm for estimating an approximation to the gradient of the average reward as a function of the parameters of a stochastic policy. The algorithm’s chief advantages are that it requires only a single sample path of the underlying Markov chain, it uses only one free parameter β ∈ [0, 1), which has a…
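As a rough illustration of the kind of estimator the abstract describes, the sketch below runs a GPOMDP-style update on a single sample path: a discounted eligibility trace of the policy's score function is multiplied by the observed reward and averaged over time. Everything specific here is an assumption made for the example and is not taken from the paper: the function name `gpomdp_gradient`, the one-parameter logistic policy (which ignores observations), and the toy dynamics in which the next state equals the chosen action with probability 0.9 and state 1 pays reward 1.

```python
import math
import random


def gpomdp_gradient(theta, beta, num_steps, seed=0):
    """Estimate d(average reward)/d(theta) from one sample path.

    Hypothetical toy setup: a logistic policy picks action 1 with
    probability sigmoid(theta); beta in [0, 1) discounts the trace.
    """
    rng = random.Random(seed)
    p_one = 1.0 / (1.0 + math.exp(-theta))  # P(action = 1)
    z = 0.0      # eligibility trace of score functions
    delta = 0.0  # running average of reward * trace
    for t in range(num_steps):
        action = 1 if rng.random() < p_one else 0
        # Score function: d/dtheta of log P(action) under the logistic policy.
        score = (1.0 - p_one) if action == 1 else -p_one
        z = beta * z + score
        # Assumed toy dynamics: next state equals the action w.p. 0.9.
        state = action if rng.random() < 0.9 else 1 - action
        reward = 1.0 if state == 1 else 0.0
        # Incremental mean of reward * trace over the path so far.
        delta += (reward * z - delta) / (t + 1)
    return delta


if __name__ == "__main__":
    print(gpomdp_gradient(theta=0.0, beta=0.9, num_steps=50000, seed=1))
```

In this toy problem the average reward is 0.1 + 0.8·sigmoid(theta), so the true gradient at theta = 0 is 0.2. Because the reward depends only on the most recent action, the estimate should be close to that value for any β, with larger β mainly adding variance.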
107 Citations
Parameter-exploring policy gradients
  • 189
Experiments with Infinite-Horizon, Policy-Gradient Estimation
  • 173
Model-based Policy Gradient Reinforcement Learning
  • 17
Exploration in Gradient-Based Reinforcement Learning
  • 40
Multimodal Parameter-exploring Policy Gradients
  • 14
State-Dependent Exploration for Policy Gradient Methods
  • 82
Mixing-Time Regularized Policy Gradient
  • 5
Learning from Scarce Experience
  • 66
