Lucas Lehnert

We don’t have enough information about this author to calculate their statistics. If you think this is an error let us know.
Learn More
Off-policy learning refers to the problem of learning the value function of a behaviour, or policy, while selecting actions with a different policy. Gradient-based off-policy learning algorithms, such as GTD (Sutton et al., 2009b) and TDC/GQ (Sutton et al., 2009a), converge when selecting actions with a fixed policy even when using function approximation(More)
Curiosity towards exploring new objects in one's environment is a key driver of intelligent agents. We explore the problem of mapping in environments which are non-stationary, and where areas may exhibit different change patterns. This is an important challenge for potential " domestic " robots, which would have to perform tasks in houses. We propose a(More)
  • 1