Lucas Lehnert

One question central to Reinforcement Learning is how to learn a feature representation that supports algorithm scaling and re-use of learned information from different tasks. Successor Features approach this problem by learning a feature representation that satisfies a temporal constraint. We present an implementation of an approach that decouples the ...
Off-policy learning refers to the problem of learning the value function of a behaviour, or policy, while selecting actions with a different policy. Gradient-based off-policy learning algorithms, such as GTD (Sutton et al., 2009b) and TDC/GQ (Sutton et al., 2009a), converge when selecting actions with a fixed policy even when using function approximation ...
Curiosity towards exploring new objects in one’s environment is a key driver of intelligent agents. We explore the problem of mapping in environments which are non-stationary, and where areas may exhibit different change patterns. This is an important challenge for potential “domestic” robots, which would have to perform tasks in houses. We propose a ...