A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index

@article{Kallenberg1986ANO,
  title={A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index},
  author={Lodewijk C. M. Kallenberg},
  journal={Math. Oper. Res.},
  year={1986},
  volume={11},
  pages={184-186}
}
  • L. Kallenberg
  • Published 1 February 1986
  • Mathematics
  • Math. Oper. Res.
In a recent paper, Katehakis and Chen propose a sequence of linear programs for the computation of the Gittins indices. If there are $N$ projects and project $v$ has $K_v$ states, then $\sum_{v=1}^{N} K_v$ linear programs have to be solved. In this note it is shown that, instead of the $K_v$ linear programs for project $v$, a single parametric linear program with the same dimensions can be solved.
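To make concrete what "the Gittins indices of a project" are, here is a minimal Python sketch. It does not implement the linear programs of Katehakis and Chen or the parametric linear program of this note. It swaps in plain value iteration on the well-known restart-in-state characterization (Katehakis and Veinott), under which the index of state i equals (1 - beta) times the optimal value, starting in i, of a problem where one may either continue or restart in i at every step. The function name `gittins_indices`, the discount factor `beta`, and the tolerance parameters are assumptions introduced for illustration; none of them appear in the abstract.

```python
def gittins_indices(P, r, beta, tol=1e-12, max_iter=100_000):
    """Gittins indices of a single finite-state project (illustrative sketch).

    P    : transition matrix as a list of rows (each row sums to 1)
    r    : list of one-step rewards, one per state
    beta : discount factor, assumed to satisfy 0 < beta < 1
    """
    n = len(r)
    indices = []
    for i in range(n):
        # Solve the restart-in-state-i problem by value iteration.
        V = [0.0] * n
        for _ in range(max_iter):
            # Q[j]: reward now plus discounted value when acting from state j.
            Q = [r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
                 for j in range(n)]
            # In every state j, either continue from j or restart in state i.
            V_new = [max(Q[j], Q[i]) for j in range(n)]
            diff = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if diff < tol:
                break
        indices.append((1.0 - beta) * V[i])
    return indices


# Hypothetical two-state project: state 0 pays 0 and moves to state 1,
# state 1 pays 1 and is absorbing.
example = gittins_indices([[0.0, 1.0], [0.0, 1.0]], [0.0, 1.0], beta=0.9)
print(example)  # approximately [0.9, 1.0]
```

This sketch solves one fixed-point problem per state, which mirrors the count in the abstract: one problem per state of each project, hence $K_v$ problems for project $v$. The note's contribution is to replace those $K_v$ problems with a single parametric linear program.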

A $(2/3)n^3$ Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain

A new fast-pivoting algorithm is presented that computes the $n$ Gittins index values of an $n$-state bandit by performing $(2/3)n^3 + O(n^2)$ arithmetic operations, thus attaining better complexity than previous algorithms and matching that of solving a corresponding linear-equation system by Gaussian elimination.

A bisection/successive approximation method for computing Gittins indices

An iterative method, combining bisections and successive approximations, is proposed for computing intervals containing the Gittins indices, which in many applications is sufficient.

The Multi-Armed Bandit Problem: Decomposition and Computation

It is shown that an approximate largest-index rule yields an approximately optimal policy for the N-project problem, and methods are given for computing the indices on-line and/or for sparse transition matrices in large state spaces that are more efficient than those suggested heretofore.

Concentration of Measure

In mathematics, concentration of measure (e.g. about a median) is a principle that is applied in measure theory, probability and combinatorics, and has consequences for other fields such as Banach

A Fast-Pivoting Algorithm for Whittle’s Restless Bandit Index

A new fast-pivoting algorithm is obtained that computes the $n$ Whittle index values of an $n$-state restless bandit by performing, after an initialization stage, $n$ steps that entail $(2/3)n^3 + O(n^2)$ arithmetic operations.

Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications

  • L. Kallenberg
  • Computer Science, Mathematics
    Math. Methods Oper. Res.
  • 1994
This paper deals with some applications of Markov decision models for which the linear programming method is efficient, including replacement models, separable models and the multi-armed bandit model.

Concentration of Measure

Concentration of measure plays a central role in the content of this book. This chapter gives the first account of this subject. Bernstein-type concentration inequalities are often used to

Optimal Design for Least Squares Estimators

  • Mathematics
  • 2020
In the preceding chapters we introduced the linear bandit and showed how to construct confidence intervals for least squares estimators. We now study the problem of choosing actions for which these

Combinatorial Bandits

...

References

Mathematical Programming Methods for Logistics Planning.

This project was concerned with the application of mathematical programming models and techniques to logistics planning problems. Basic research was performed on a new approach, called

Optimization Over Time

Transient policies in discrete dynamic programming: Linear programming including suboptimality tests and additional constraints

This paper investigates the computation of transient-optimal policies in discrete dynamic programming. The concept of superharmonicity is introduced, which provides the linear program to compute the transient value-vector and a transient-optimal policy.

Linear Programming for Finite State Multi-Armed Bandit Problems

It is shown that when the state space is finite the computation of the dynamic allocation indices can be handled by linear programming methods.

Introduction to Stochastic Dynamic Programming

  • 1983