We consider the problem of learning high-performance Exploration/Exploitation (E/E) strategies for finite Markov Decision Processes (MDPs) when the MDP to be controlled is assumed to be drawn from a known probability distribution p_M(·). The performance criterion is the sum of discounted rewards collected by the E/E strategy over an infinite length…
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the rewards they collect while interacting with their environment, using prior knowledge that is available beforehand. Many BRL algorithms have already been proposed, but the benchmarks used to compare them are only relevant for specific cases. The paper addresses this…
Faculty of Applied Sciences, Department of Electrical Engineering and Computer Science. Doctor of Computer Science. Offline Policy-search in Bayesian Reinforcement Learning, by Michael CASTRONOVO. This thesis presents research contributions in the field of Bayesian Reinforcement Learning, a subfield of Reinforcement Learning where, even though the…