Learn More
Although a manipulator must interact with objects in terms of their full complexity, it is the qualitative structure of the objects in an environment and the relationships between them which define the composition of that environment, and allow for the construction of efficient plans to enable the completion of various elaborate tasks. This paper presents(More)
We describe the first part of a study investigating the usefulness of high school language results as a predictor of success in first year computer science courses at a university where students have widely varying English language skills. Our results indicate that contrary to the generally accepted view that achievement in high school mathematics courses(More)
—The computational complexity of learning in sequential decision problems grows exponentially with the number of actions available to the agent at each state. We present a method for accelerating this process by learning action priors that express the usefulness of each action in each state. These are learned from a set of different optimal policies from(More)
Different interfaces allow a user to achieve the same end goal through different action sequences, e.g., command lines vs. drop down menus. Interface efficiency can be described in terms of a cost incurred, e.g., time taken, by the user in typical tasks. Realistic users arrive at evaluations of efficiency, hence making choices about which interface to use,(More)
In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an interaction. This work focuses on the problem of recognizing the strategy used by an agent within a small number of interactions. We propose using a Bayesian framework to address this problem. Bayesian policy reuse (BPR) has been(More)
We present algorithms to effectively represent a set of Markov decision processes (MDPs), whose optimal policies have already been learned, by a smaller source subset for lifelong, policy-reuse-based transfer learning in reinforcement learning. This is necessary when the number of previous tasks is large and the cost of measuring similarity counteracts the(More)
— We present a method for segmenting a set of unstructured demonstration trajectories to discover reusable skills using inverse reinforcement learning (IRL). Each skill is characterised by a latent reward function which the demonstrator is assumed to be optimizing. The skill boundaries and the number of skills making up each demonstration are unknown. We(More)
— This paper addresses the problem of acquiring a hierarchically structured robotic skill in a nonstationary environment. This is achieved through a combination of learning primitive strategies from observation of an expert, and autonomously synthesising composite strategies from that basis. Both aspects of this problem are approached from a game theoretic(More)