Reducing reinforcement learning to KWIK online regression

Lihong Li and Michael L. Littman. Annals of Mathematics and Artificial Intelligence.
One of the key problems in reinforcement learning (RL) is balancing exploration and exploitation. Another is learning and acting in large Markov decision processes (MDPs) where compact function approximation has to be used. This paper introduces REKWIRE, a provably efficient, model-free algorithm for finite-horizon RL problems with value function approximation (VFA) that addresses the exploration-exploitation tradeoff in a principled way. The crucial element of this algorithm is a reduction of…
This paper has 22 citations.