Corpus ID: 216036114

Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition

@article{Zhang2020AlmostOM,
  title={Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition},
  author={Zihan Zhang and Y. Zhou and Xiangyang Ji},
  journal={ArXiv},
  year={2020},
  volume={abs/2004.10019}
}
We study the reinforcement learning problem in the setting of finite-horizon episodic Markov Decision Processes (MDPs) with $S$ states, $A$ actions, and episode length $H$. We propose a model-free algorithm, UCB-Advantage, and prove that it achieves $\tilde{O}(\sqrt{H^2SAT})$ regret, where $T = KH$ and $K$ is the number of episodes played. Our regret bound improves upon the results of [Jin et al., 2018] and matches the best known model-based algorithms as well as the information-theoretic lower…
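The excerpt names UCB-Advantage as a model-free algorithm but does not reproduce its details. As a point of orientation, the sketch below shows the generic optimistic model-free Q-learning template (in the style of [Jin et al., 2018]) that such algorithms refine: optimistic initialization, a step-size of $(H+1)/(H+t)$, and a Hoeffding-style exploration bonus. The reference-advantage decomposition itself is not implemented here, and the toy environment interface is an assumption for illustration.

```python
import numpy as np

def ucb_q_learning(P, R, S, A, H, K, c=1.0, seed=0):
    """Tabular optimistic Q-learning on a finite-horizon episodic MDP (sketch).

    P: transition tensor of shape (H, S, A, S), each P[h, s, a] a distribution.
    R: rewards in [0, 1], shape (H, S, A).
    Returns the learned optimistic Q table of shape (H, S, A).
    """
    rng = np.random.default_rng(seed)
    Q = np.full((H, S, A), float(H))       # optimistic initialization: Q <= H
    N = np.zeros((H, S, A), dtype=int)     # visit counts per (h, s, a)
    V = np.zeros((H + 1, S))               # V[H] = 0 by convention
    for _ in range(K):
        s = 0                              # fixed initial state for simplicity
        for h in range(H):
            a = int(np.argmax(Q[h, s]))    # greedy w.r.t. optimistic Q
            N[h, s, a] += 1
            t = N[h, s, a]
            alpha = (H + 1) / (H + t)      # learning rate from Jin et al., 2018
            bonus = c * np.sqrt(H**3 / t)  # Hoeffding-style exploration bonus
            s_next = rng.choice(S, p=P[h, s, a])
            target = R[h, s, a] + V[h + 1, s_next] + bonus
            Q[h, s, a] = (1 - alpha) * Q[h, s, a] + alpha * target
            V[h, s] = min(float(H), Q[h, s].max())
            s = s_next
    return Q
```

UCB-Advantage improves on this template by maintaining a fixed reference value function and learning the (lower-variance) advantage relative to it, which is what tightens the $H$-dependence of the regret from $\tilde{O}(\sqrt{H^3SAT})$ to $\tilde{O}(\sqrt{H^2SAT})$.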

References

Variance Reduction Methods for Sublinear Reinforcement Learning
Is Q-learning Provably Efficient?
Regret Minimization for Reinforcement Learning by Evaluating the Optimal Bias Function
Provably Efficient Q-Learning with Low Switching Cost
Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
Minimax Regret Bounds for Reinforcement Learning
PAC model-free reinforcement learning
Near-optimal Regret Bounds for Reinforcement Learning