Corpus ID: 231846402

Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

@article{Dong2021ProvableMN,
  title={Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature},
  author={Kefan Dong and Jia-Qi Yang and Tengyu Ma},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.04168}
}
This paper studies model-based bandit and reinforcement learning (RL) with nonlinear function approximation. We propose to study convergence to approximate local maxima, because we show that global convergence is statistically intractable even for a one-layer neural-net bandit with a deterministic reward. For both nonlinear bandits and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL), which provably converges to a local maximum with sample complexity…
