Corpus ID: 232146750

Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems

@article{Mate2021EfficientAF,
  title={Efficient Algorithms for Finite Horizon and Streaming Restless Multi-Armed Bandit Problems},
  author={Aditya Mate and Arpita Biswas and Christoph Siebenbrunner and Milind Tambe},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.04730}
}
Restless Multi-Armed Bandits (RMABs) have been popularly used to model limited resource allocation problems. Recently, these have been employed for health monitoring and intervention planning problems. However, the existing approaches fail to account for the arrival of new patients and the departure of enrolled patients from a treatment program. To address this challenge, we formulate a streaming bandit (S-RMAB) framework, a generalization of RMABs where heterogeneous arms arrive and leave…
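To make the streaming setting described in the abstract concrete, here is a minimal simulation sketch of an S-RMAB-style instance: arms (e.g., program beneficiaries) arrive over time, remain for a finite residual horizon, and then depart, while the planner acts on at most a fixed budget of arms per round. The Arm class, the transition probabilities, the Poisson arrival rate, and the act-on-shortest-horizon placeholder policy are illustrative assumptions, not the algorithm proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class Arm:
    """One beneficiary: a 2-state (0 = non-adhering, 1 = adhering) Markov
    chain that remains in the program for a finite residual horizon."""
    def __init__(self, horizon):
        self.horizon = horizon                      # rounds left before departure
        self.state = int(rng.integers(2))
        # probability of moving to the adhering state, per current state (hypothetical)
        self.p_good = {"passive": [0.3, 0.7], "active": [0.6, 0.9]}

    def step(self, action):
        self.state = int(rng.random() < self.p_good[action][self.state])
        self.horizon -= 1
        return self.state                           # reward 1 while adhering

def simulate(rounds=20, budget=2, arrival_rate=1.5):
    arms, total_reward = [], 0
    for _ in range(rounds):
        # heterogeneous arms arrive over time with random residual horizons
        arms += [Arm(horizon=int(rng.integers(3, 10)))
                 for _ in range(rng.poisson(arrival_rate))]
        # placeholder policy: spend the budget on arms closest to departure
        arms.sort(key=lambda a: a.horizon)
        for i, arm in enumerate(arms):
            total_reward += arm.step("active" if i < budget else "passive")
        arms = [a for a in arms if a.horizon > 0]   # departures leave the system
    return total_reward

print(simulate())
```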
1 Citation

AI for Planning Public Health Interventions
This dissertation casts public health intervention planning as a Restless Multi-Armed Bandit (RMAB) planning problem, identifying and addressing several new, fundamental questions in RMABs.

References

Showing 1-10 of 29 references
Faster Dynamic Matrix Inverse for Faster LPs
The proposed data structure is based on a recursive application of the Woodbury-Morrison identity for implementing low-rank updates, combined with recent sketching technology, and leads to the fastest known LP solver for general (dense) linear programs.
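As a rough illustration of the low-rank inverse maintenance this reference builds on, the snippet below applies the rank-1 Sherman-Morrison special case of the Woodbury identity to update a cached inverse in O(n²) instead of recomputing it from scratch; the sketching machinery and the LP-solver data structure from the paper are not reproduced here, and the test matrix is arbitrary.

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Given A^{-1}, return (A + u v^T)^{-1} in O(n^2) via the rank-1
    Sherman-Morrison formula (Woodbury generalizes this to rank-k updates)."""
    Au = A_inv @ u
    vA = v @ A_inv
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
u, v = rng.standard_normal(n), rng.standard_normal(n)

fast = sherman_morrison_update(np.linalg.inv(A), u, v)
slow = np.linalg.inv(A + np.outer(u, v))          # direct recomputation, O(n^3)
print(np.allclose(fast, slow))                    # True
```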
Restless bandits with controlled restarts: Indexability and computation of Whittle index
This work presents detailed numerical experiments suggesting that the Whittle index policy performs close to the optimal policy and significantly better than the commonly used myopic heuristic.
Restless Poachers: Handling Exploration-Exploitation Tradeoffs in Security Domains
This paper formulates the problem as a restless multi-armed bandit (RMAB) model, provides two sufficient conditions for indexability along with an algorithm to numerically evaluate indexability, and proposes a binary-search-based algorithm to compute the Whittle index policy efficiently.
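The binary search idea mentioned above can be sketched for a single finite-state arm: the Whittle index of a state is the passive-action subsidy at which the planner is indifferent between acting and staying passive, so one can bisect on the subsidy while re-solving the single-arm problem by value iteration. This is a generic illustration assuming indexability and a discounted criterion; the transition matrices, rewards, discount factor, and search bounds are placeholders, not the setting of the cited paper.

```python
import numpy as np

def solve_arm(P_passive, P_active, R, subsidy, beta=0.95, tol=1e-6):
    """Discounted value iteration for one restless arm in which the passive
    action earns `subsidy` on top of the state reward R[s]."""
    V = np.zeros(len(R))
    while True:
        Q_passive = R + subsidy + beta * P_passive @ V
        Q_active = R + beta * P_active @ V
        V_new = np.maximum(Q_passive, Q_active)
        if np.max(np.abs(V_new - V)) < tol:
            return Q_passive, Q_active
        V = V_new

def whittle_index(P_passive, P_active, R, state, beta=0.95,
                  lo=-10.0, hi=10.0, iters=40):
    """Bisect on the subsidy until the planner is indifferent between acting
    and staying passive in `state` (assumes the arm is indexable)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        Qp, Qa = solve_arm(P_passive, P_active, R, mid, beta)
        if Qa[state] > Qp[state]:
            lo = mid          # acting still preferred: raise the subsidy
        else:
            hi = mid          # passivity already preferred: lower it
    return 0.5 * (lo + hi)

# toy 2-state arm with hypothetical dynamics: acting makes the good state likelier
P_passive = np.array([[0.9, 0.1], [0.4, 0.6]])
P_active = np.array([[0.4, 0.6], [0.1, 0.9]])
R = np.array([0.0, 1.0])
print(whittle_index(P_passive, P_active, R, state=0))
```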
Indexability of Restless Bandit Problems and Optimality of Whittle Index for Dynamic Multichannel Access
  • K. Liu, Qing Zhao
  • Computer Science, Mathematics
  • IEEE Transactions on Information Theory
  • 2010
This work establishes indexability, obviates the need to know the Markov transition probabilities when implementing the Whittle index policy, and develops efficient algorithms for computing a performance upper bound given by Lagrangian relaxation.
A Proof for the Queuing Formula: L = λW
In a queuing process, let 1/λ be the mean time between the arrivals of two consecutive units, L be the mean number of units in the system, and W be the mean time spent by a unit in the system. It is …
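For reference, the formula itself, with a one-line numerical instance whose numbers are arbitrary:

```latex
% Little's law: the mean number of units in the system equals the
% arrival rate times the mean time a unit spends in the system.
L = \lambda W
% Example: arrivals every 2 minutes (\lambda = 0.5 per minute) and
% W = 10 minutes give L = 0.5 \times 10 = 5 units on average.
```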
Collapsing Bandits and Their Application to Public Health Interventions
A new restless multi-armed bandit setting in which each arm follows a binary-state Markovian process with a special structure: when an arm is played, its state is observed exactly, "collapsing" any uncertainty, but when an arm is passive, no observation is made, allowing uncertainty to evolve.
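A small sketch of the belief dynamics this summary describes, for a single two-state arm: acting reveals the state and collapses the belief, while a passive step only propagates the belief through the passive transition matrix. The transition probabilities below are hypothetical placeholders.

```python
def belief_update(belief, P_passive, observation=None):
    """Belief = probability the arm is in the 'good' state 1.

    Passive arm: no observation, so the belief drifts under the passive chain.
    Played arm: the state is observed exactly, so the belief collapses to 0 or 1.
    P_passive[s][s'] is the passive transition probability (hypothetical values).
    """
    if observation is None:  # passive: propagate uncertainty forward
        return belief * P_passive[1][1] + (1 - belief) * P_passive[0][1]
    return float(observation)  # played: uncertainty collapses

P = [[0.8, 0.2], [0.3, 0.7]]                  # hypothetical passive chain
print(belief_update(0.6, P))                  # 0.6*0.7 + 0.4*0.2 = 0.5
print(belief_update(0.6, P, observation=1))   # collapses to 1.0
```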
Whittle Index for AoI-Aware Scheduling
This work considers a system of multiple sensors that send updates to a monitoring station via a shared communication channel and shows that Whittle-index-based scheduling policies either outperform or match the performance of the best-known policy in all settings studied.
An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits
We propose an asymptotically optimal heuristic, which we term randomized assignment control (RAC), for a restless multi-armed bandit problem with discrete time and finite states. It is …
Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data
A deep learning model is constructed that can be used to proactively intervene with 21% more patients, and before 76% more missed doses, than current heuristic baselines, and that performs 40% better than baseline methods at outcome prediction, allowing cities to target more resources to clinics with a heavier burden of patients at risk of failure.
Optimal Screening for Hepatocellular Carcinoma: A Restless Bandit Model
This paper seeks an efficient way to screen a population of patients at risk for hepatocellular carcinoma when (1) each patient’s disease evolves stochastically and (2) there are limited screening ...