Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes


In clinical practice, physicians make a series of treatment decisions over the course of a patient's disease based on the patient's baseline and evolving characteristics. A dynamic treatment regime is a set of sequential decision rules that operationalizes this process. Each rule corresponds to a decision point and dictates the next treatment action based on the accrued information. A key goal is to use existing data to estimate the optimal regime, the one that, if followed by the patient population, would yield the most favorable outcome on average. Q- and A-learning are two main approaches for this purpose. We provide a detailed account of these methods, study their performance, and illustrate them using data from a depression study.
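To make the backward-induction idea behind Q-learning concrete, here is a minimal sketch for two decision points. It is an illustration under strong assumptions, not the estimators studied in the paper: states and treatments are discrete, treatments are binary, there is no unmeasured confounding, and each Q-function is estimated by a plain cell mean rather than a regression model. The function names, the data layout `(s1, a1, s2, a2, y)`, and the toy records are all hypothetical.

```python
from collections import defaultdict


def group_means(pairs):
    """Average the observed values for each key in (key, value) pairs."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, value in pairs:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def best_action(q, history, actions=(0, 1)):
    """Return (action, value) maximizing the estimated Q-function at `history`.

    Only actions actually observed at this history are considered;
    ties go to the larger action code.
    """
    candidates = [(q[(history, a)], a) for a in actions if (history, a) in q]
    value, action = max(candidates)
    return action, value


def q_learning_two_stage(data):
    """Estimate stage-1 and stage-2 Q-functions by backward induction.

    `data` holds records (s1, a1, s2, a2, y): baseline state, first treatment,
    intermediate state, second treatment, final outcome (larger is better).
    """
    # Stage 2: mean outcome for each (accrued history, second treatment).
    q2 = group_means([(((s1, a1, s2), a2), y) for s1, a1, s2, a2, y in data])
    # Stage 1: replace y by the value of acting optimally at stage 2,
    # then average by (baseline state, first treatment).
    q1 = group_means(
        [((s1, a1), best_action(q2, (s1, a1, s2))[1]) for s1, a1, s2, a2, y in data]
    )
    return q1, q2


# Hypothetical toy records, invented purely for illustration.
toy_data = [
    (0, 0, 0, 0, 1.0),
    (0, 0, 0, 1, 3.0),
    (0, 1, 1, 0, 4.0),
    (0, 1, 1, 1, 2.0),
]
q1, q2 = q_learning_two_stage(toy_data)
first_rule = best_action(q1, 0)           # rule at decision point 1
second_rule = best_action(q2, (0, 1, 1))  # rule at decision point 2
```

The estimated regime is the pair of rules obtained by maximizing `q1` and `q2` at each accrued history. With continuous covariates the cell means would be replaced by fitted regression models, and A-learning would instead model only the treatment contrasts; neither is shown here.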

DOI: 10.1214/13-STS450


Cite this paper

@article{Schulte2014QAA, title={Q- and A-learning Methods for Estimating Optimal Dynamic Treatment Regimes}, author={Phillip J. Schulte and Anastasios A. Tsiatis and Eric B. Laber and Marie Davidian}, journal={Statistical Science}, year={2014}, volume={29}, number={4}, pages={640--661}, doi={10.1214/13-STS450} }