Sample Complexity Bounds of Exploration

@inproceedings{Li2011SampleCB,
  title={Sample Complexity Bounds of Exploration},
  author={Lihong Li},
  year={2011}
}
Efficient exploration is widely recognized as a fundamental challenge inherent in reinforcement learning. Algorithms that explore efficiently converge faster to near-optimal policies. While heuristic techniques are popular in practice, they lack formal guarantees and may not work well in general. This chapter studies algorithms with polynomial sample complexity of exploration, both model-based and model-free, in a unified manner. These so-called PAC-MDP algorithms behave near-optimally…
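
For context, the notion at the heart of the abstract has a standard formalization in the literature this chapter surveys (Kakade, 2003; Strehl, Li, and Littman, 2009). The sample complexity of exploration of an algorithm \(\mathcal{A}\) is the number of "mistake" steps at which its current policy is more than \(\epsilon\) worse than optimal:

\[
  \zeta(\epsilon,\delta) \;=\; \bigl|\{\, t : V^{\mathcal{A}_t}(s_t) < V^*(s_t) - \epsilon \,\}\bigr|,
\]

and \(\mathcal{A}\) is PAC-MDP if, for all \(\epsilon > 0\) and \(\delta \in (0,1)\), with probability at least \(1-\delta\),

\[
  \zeta(\epsilon,\delta) \;\le\; \mathrm{poly}\!\left(|S|,\, |A|,\, \tfrac{1}{\epsilon},\, \tfrac{1}{\delta},\, \tfrac{1}{1-\gamma}\right).
\]

To make the model-based side concrete, below is a minimal Python sketch of an R-max-style algorithm (Brafman and Tennenholtz, 2002), a canonical PAC-MDP method of the kind analyzed in this line of work. The environment interface (env.reset/env.step) and all parameter values are illustrative assumptions, not the chapter's notation.

import numpy as np

def rmax(env, n_states, n_actions, r_max, gamma=0.95, m=20,
         horizon=10_000, vi_iters=200):
    """R-max sketch: treat under-visited (s, a) pairs as maximally
    rewarding, plan optimistically, and act greedily."""
    counts = np.zeros((n_states, n_actions))           # visit counts n(s, a)
    trans = np.zeros((n_states, n_actions, n_states))  # transition counts n(s, a, s')
    rew_sum = np.zeros((n_states, n_actions))          # cumulative rewards
    v_opt = r_max / (1.0 - gamma)                      # optimistic value for unknown pairs

    s = env.reset()
    for _ in range(horizon):
        # Value iteration on the optimistic MDP: unknown pairs keep the
        # optimistic value v_opt; known pairs use the empirical model.
        # (Replanning every step is wasteful; it is only needed when a
        # new pair becomes known, but this keeps the sketch simple.)
        q = np.full((n_states, n_actions), v_opt)
        for _ in range(vi_iters):
            v = q.max(axis=1)
            for a in range(n_actions):
                known = counts[:, a] >= m
                if known.any():
                    p = trans[known, a] / counts[known, a, None]  # empirical P(s'|s,a)
                    r = rew_sum[known, a] / counts[known, a]      # empirical R(s,a)
                    q[known, a] = r + gamma * (p @ v)
        # "Optimism in the face of uncertainty" steers the greedy policy
        # toward under-visited pairs.
        a = int(np.argmax(q[s]))
        s_next, reward, done = env.step(a)  # assumed interface
        if counts[s, a] < m:  # freeze statistics once (s, a) is "known"
            counts[s, a] += 1
            trans[s, a, s_next] += 1
            rew_sum[s, a] += reward
        s = env.reset() if done else s_next
    return q

The knownness threshold m governs the trade-off: the PAC-MDP analysis chooses m polynomial in the relevant quantities so that empirical estimates for known pairs are accurate with high probability. Delayed Q-learning (Strehl et al., 2006) provides the analogous model-free guarantee.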
