Online Learning with Sample Path Constraints

Shie Mannor, John N. Tsitsiklis, Jia Yuan Yu
Journal of Machine Learning Research
We study online learning when the objective of the decision maker is to maximize her long-term average reward subject to certain sample path average constraints. We define the reward-in-hindsight as the highest reward the decision maker could have achieved, while satisfying the constraints, had she known Nature’s choices in advance. We show that in general the reward-in-hindsight is not attainable. The convex hull of the reward-in-hindsight function is, however, attainable. For the important…
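
The definition of the reward-in-hindsight can be made concrete with a small sketch. The setup below is an assumption for illustration, not something the abstract states: both players have finite action sets, the reward and cost are given by matrices R and C, there is a single average-cost constraint with threshold c0, and Nature's choices are summarized by their empirical distribution q. Under those assumptions the reward-in-hindsight is the value of a small linear program over the decision maker's mixed actions p, and with only one cost constraint an optimal p mixes at most two pure actions. The function name and all parameter names are hypothetical.

```python
from itertools import combinations


def reward_in_hindsight(R, C, q, c0):
    """Best average reward against a known empirical distribution q.

    Assumed setup (hypothetical, for illustration only):
      R[i][j], C[i][j]: reward and cost when the decision maker plays i
                        and Nature plays j.
      q:  empirical distribution of Nature's actions.
      c0: upper bound on the average cost.
    Returns (value, p), or None if no mixed action satisfies the constraint.
    """
    n = len(R)
    # Expected reward and cost of each pure action against q.
    r = [sum(R[i][j] * q[j] for j in range(len(q))) for i in range(n)]
    c = [sum(C[i][j] * q[j] for j in range(len(q))) for i in range(n)]

    best = None

    # Candidate 1: pure actions that already satisfy the constraint.
    for i in range(n):
        if c[i] <= c0 and (best is None or r[i] > best[0]):
            p = [0.0] * n
            p[i] = 1.0
            best = (r[i], p)

    # Candidate 2: mixtures of two actions that make the constraint tight.
    for i, j in combinations(range(n), 2):
        lo, hi = (i, j) if c[i] < c[j] else (j, i)
        if c[lo] < c0 < c[hi]:
            w = (c0 - c[lo]) / (c[hi] - c[lo])  # weight on the costlier action
            value = (1 - w) * r[lo] + w * r[hi]
            if best is None or value > best[0]:
                p = [0.0] * n
                p[lo] = 1 - w
                p[hi] = w
                best = (value, p)

    return best


# Toy instance: action 0 earns reward 1 at cost 1, action 1 earns 0 at cost 0.
R = [[1.0, 1.0], [0.0, 0.0]]
C = [[1.0, 1.0], [0.0, 0.0]]
value, p = reward_in_hindsight(R, C, q=[0.5, 0.5], c0=0.5)
print(value, p)
```

In this toy instance the rewarding action violates the constraint on its own, so the best the decision maker could have done in hindsight is to mix the two actions until the average cost equals c0. The sketch only evaluates the benchmark for a fixed q; it says nothing about whether an online strategy can attain it, which is the question the paper addresses.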


