Relating reinforcement learning performance to classification performance


We prove a quantitative connection between the expected sum of rewards of a policy and binary classification performance on created subproblems. This connection holds without any unobservable assumptions (no assumption of independence, small mixing time, fully observable states, or even hidden states) and the resulting statement is independent of the number…
DOI: 10.1145/1102351.1102411

