Reinforcement Learning with Immediate Rewards and Linear Hypotheses

Naoki Abe, Alan W. Biermann, Philip M. Long
We consider the design and analysis of algorithms that learn from the consequences of their actions with the goal of maximizing their cumulative reward, when the consequence of a given action is felt immediately, and a linear function, which is unknown a priori, (approximately) relates a feature vector for each action/state pair to the (expected) associated reward. We focus on two cases, one in which a continuous-valued reward is (approximately) given by applying the unknown linear function…
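To make the setting concrete, here is a minimal Python sketch of the continuous-reward case described in the abstract: each action in the current state has a feature vector, the reward for the chosen action is a noisy linear function of that vector, and the learner sees the reward immediately. This is not the algorithm analyzed in the paper. The learner below uses epsilon-greedy exploration with a least-mean-squares (Widrow-Hoff) update purely as an illustrative assumption, and every name and parameter (`run_linear_reward_learner`, `true_w`, `epsilon`, `step`, `noise`) is hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def run_linear_reward_learner(true_w, n_rounds=2000, n_actions=5,
                              epsilon=0.1, step=0.05, noise=0.1, seed=0):
    """Simulate learning with immediate rewards under a linear hypothesis.

    Assumptions (not taken from the paper): features are uniform in [-1, 1],
    noise is Gaussian, exploration is epsilon-greedy, and the weight
    estimate is updated with a least-mean-squares step.
    """
    rng = random.Random(seed)
    dim = len(true_w)
    w_hat = [0.0] * dim          # learner's estimate of the unknown weights
    total_reward = 0.0

    for _ in range(n_rounds):
        # One feature vector per available action in the current state.
        feats = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_actions)]

        if rng.random() < epsilon:
            # Explore: pick an action uniformly at random.
            choice = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the highest predicted reward.
            choice = max(range(n_actions),
                         key=lambda a: dot(w_hat, feats[a]))

        x = feats[choice]
        # The reward is observed immediately; only the chosen action's
        # reward is revealed.
        reward = dot(true_w, x) + rng.gauss(0.0, noise)
        total_reward += reward

        # LMS update: move the estimate along the prediction error.
        err = reward - dot(w_hat, x)
        w_hat = [wi + step * err * xi for wi, xi in zip(w_hat, x)]

    return w_hat, total_reward


if __name__ == "__main__":
    w_hat, total = run_linear_reward_learner([1.0, -0.5, 0.25])
    print("estimated weights:", [round(w, 3) for w in w_hat])
    print("cumulative reward:", round(total, 1))
```

Because the learner only observes the reward of the action it takes, exploitation alone can leave some feature directions poorly estimated; the `epsilon` parameter is one simple, assumed way to keep gathering information while still accumulating reward.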
This paper has 45 citations.