Linear Regression and Its Application to Model-Based Reinforcement Learning

We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a model-based approach and show that a special type of online linear regression allows us to learn MDPs with (possibly kernelized) linearly parameterized dynamics. This result builds…
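To make the idea of learning linearly parameterized dynamics by online regression concrete, here is a minimal sketch. It uses textbook recursive ridge regression (a Sherman-Morrison update), which is not necessarily the special variant the abstract refers to. The class `OnlineRidge`, the feature map `features`, the helper `run_demo`, the matrix `true_W`, and the noise level are all hypothetical choices made for illustration.

```python
import random


class OnlineRidge:
    """Recursive ridge regression for y ~ W * phi, updated one sample at a time.

    Illustrative only: a standard recursive least-squares estimator,
    not the specific algorithm analyzed in the paper.
    """

    def __init__(self, n_features, n_outputs, reg=1.0):
        # P tracks the inverse of (reg * I + sum of phi * phi^T).
        self.P = [[(1.0 / reg if i == j else 0.0) for j in range(n_features)]
                  for i in range(n_features)]
        # W starts at zero, so the estimate equals the ridge solution.
        self.W = [[0.0] * n_features for _ in range(n_outputs)]

    def predict(self, phi):
        return [sum(w * x for w, x in zip(row, phi)) for row in self.W]

    def update(self, phi, y):
        P_phi = [sum(p * x for p, x in zip(row, phi)) for row in self.P]
        denom = 1.0 + sum(x * v for x, v in zip(phi, P_phi))
        gain = [v / denom for v in P_phi]
        # Prediction error under the current (pre-update) estimate.
        err = [yi - pi for yi, pi in zip(y, self.predict(phi))]
        for i, e in enumerate(err):
            for j, g in enumerate(gain):
                self.W[i][j] += e * g
        # Sherman-Morrison rank-one update of the inverse matrix.
        for i, vi in enumerate(P_phi):
            for j, vj in enumerate(P_phi):
                self.P[i][j] -= vi * vj / denom


def features(state, action):
    # Hypothetical feature map: plain concatenation of state and action.
    return list(state) + list(action)


def run_demo(steps=500, seed=0):
    rng = random.Random(seed)
    # Assumed ground-truth dynamics: next_state = true_W * phi + noise.
    true_W = [[0.9, 0.1, 0.5, 0.0],
              [-0.1, 0.8, 0.0, 0.5]]
    model = OnlineRidge(n_features=4, n_outputs=2, reg=1e-3)
    state = [0.0, 0.0]
    for _ in range(steps):
        # Random exploratory actions keep the regression well conditioned.
        action = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        phi = features(state, action)
        next_state = [sum(w * x for w, x in zip(row, phi)) + rng.gauss(0.0, 0.01)
                      for row in true_W]
        model.update(phi, next_state)
        state = next_state
    return model, true_W
```

Calling `run_demo()` returns the fitted model together with the assumed true matrix, so the two can be compared entry by entry. A kernelized version would replace `features` with a nonlinear feature map while leaving the update unchanged.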