Reinforcement Learning in POMDPs Without Resets
We present new algorithms for learning in POMDPs which guarantee that the agent will obtain the optimal average reward in the limit.
Planning in POMDPs Using Multiplicity Automata
We show that POMDPs can be represented by multiplicity automata with no increase in the representation size.
On-line Markov Decision Processes
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts…