Reinforcement learning via kernel temporal difference

Abstract

This paper introduces kernel Temporal Difference (TD)(λ), a kernel adaptive filter trained with stochastic gradient updates on temporal differences, to estimate the state-action value function in reinforcement learning. Only the case λ=0 is studied in this paper. Experimental results show the method's applicability for learning motor state…
DOI: 10.1109/IEMBS.2011.6091370
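
To make the idea in the abstract concrete, the following is a minimal sketch of one way a kernel TD(0) estimator could be written. It is not the paper's implementation. It assumes a Gaussian kernel, a dictionary that grows by one center per sample, and a fixed step size; the names `KernelTD0`, `step_size`, `discount`, and `width` are hypothetical. Each input `x` stands for a state-action pair encoded as a numeric tuple, and the toy two-step chain at the end has a single action per state, so it only illustrates the update rule.

```python
import math


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two equal-length numeric tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


class KernelTD0:
    """Sketch of a kernel TD(0) value estimator (assumed form, not the paper's code).

    The value estimate is a kernel expansion:
        Q(x) = sum_i coeffs[i] * k(x, centers[i])
    """

    def __init__(self, step_size=0.5, discount=0.9, width=1.0):
        self.step_size = step_size  # stochastic gradient step size (assumed fixed)
        self.discount = discount    # discount factor gamma
        self.width = width          # Gaussian kernel width (assumption)
        self.centers = []           # stored inputs
        self.coeffs = []            # one coefficient per stored input

    def value(self, x):
        """Evaluate the current kernel expansion at x."""
        return sum(
            a * gaussian_kernel(x, c, self.width)
            for a, c in zip(self.coeffs, self.centers)
        )

    def update(self, x, reward, x_next=None):
        """Apply one TD(0) step; x_next=None marks a terminal transition."""
        target = reward
        if x_next is not None:
            target += self.discount * self.value(x_next)
        td_error = target - self.value(x)
        # Stochastic gradient step in the kernel space: add x as a new
        # center, weighted by the step size times the TD error.
        self.centers.append(tuple(x))
        self.coeffs.append(self.step_size * td_error)
        return td_error


# Toy chain: x0 -> x1 -> terminal, with reward 1 on the final transition.
x0, x1 = (0.0,), (1.0,)
model = KernelTD0(step_size=0.5, discount=0.9, width=0.3)
for _ in range(50):
    model.update(x0, 0.0, x1)
    model.update(x1, 1.0)

print(model.value(x0), model.value(x1))
```

In this sketch the dictionary grows with every sample, so evaluation cost increases linearly with the number of updates; a practical version would likely need some form of sparsification, which is outside what the abstract describes.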
