Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor

@article{Hansen2011StrategyII,
  title={Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor},
  author={Thomas Dueholm Hansen and Peter Bro Miltersen and Uri Zwick},
  journal={J. ACM},
  year={2013},
  volume={60},
  pages={1:1--1:16}
}
Ye [2011] showed recently that the simplex method with Dantzig’s pivoting rule, as well as Howard’s policy iteration algorithm, solve discounted Markov decision processes (MDPs), with a constant discount factor, in strongly polynomial time. More precisely, Ye showed that both algorithms terminate after at most $O\left(\frac{mn}{1-\gamma}\log\frac{n}{1-\gamma}\right)$ iterations, where $n$ is the number of states, $m$ is the total number of actions in the MDP, and $0 < \gamma < 1$ is…
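
To make the quantities in the abstract concrete, here is a minimal Python sketch of Howard's policy iteration on a tiny discounted MDP. It is an illustration under stated assumptions, not the paper's construction: the two-state MDP, the reward-maximizing convention, and the names `evaluate_policy`, `policy_iteration`, and `bound_expression` are all hypothetical. `bound_expression` evaluates only the expression inside the O(...) bound, with the hidden constant omitted.

```python
import math

def evaluate_policy(P, r, policy, gamma):
    """Solve (I - gamma * P_pi) v = r_pi by Gauss-Jordan elimination."""
    n = len(policy)
    # Build the augmented matrix [I - gamma * P_pi | r_pi].
    A = []
    for s in range(n):
        a = policy[s]
        row = [(1.0 if s == t else 0.0) - gamma * P[s][a][t] for t in range(n)]
        row.append(r[s][a])
        A.append(row)
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        piv = A[col][col]
        A[col] = [x / piv for x in A[col]]
        for i in range(n):
            if i != col and A[i][col] != 0.0:
                f = A[i][col]
                A[i] = [x - f * y for x, y in zip(A[i], A[col])]
    return [A[s][n] for s in range(n)]

def policy_iteration(P, r, gamma, tol=1e-12):
    """Howard's rule: switch every state that has an improving action."""
    n = len(P)
    policy = [0] * n
    iterations = 0
    while True:
        v = evaluate_policy(P, r, policy, gamma)
        new_policy = list(policy)
        improved = False
        for s in range(n):
            def q(a):
                # One-step lookahead value of action a in state s.
                return r[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
            best = max(range(len(P[s])), key=q)
            if q(best) > q(policy[s]) + tol:
                new_policy[s] = best
                improved = True
        if not improved:
            return policy, v, iterations
        policy = new_policy
        iterations += 1

def bound_expression(n, m, gamma):
    """The expression inside the O(...) bound; the constant factor is omitted."""
    return (m * n / (1.0 - gamma)) * math.log(n / (1.0 - gamma))

# Hypothetical MDP: P[s][a][t] is the probability of moving from s to t
# under action a, and r[s][a] is the immediate reward (maximized here).
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
r = [[0.0, 1.0],
     [2.0, 0.0]]

policy, values, iterations = policy_iteration(P, r, gamma=0.5)
print(policy, values, iterations)
print(bound_expression(n=2, m=4, gamma=0.5))
```

On this toy instance the loop stops after a single improvement step, well below the value of the bound expression. That comparison is only suggestive, since the stated bound hides a constant factor.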
This paper has 58 citations.