Corpus ID: 51867872

Teaching Multiple Tasks to an RL Agent using LTL

@inproceedings{Icarte2018TeachingMT,
  title={Teaching Multiple Tasks to an RL Agent using LTL},
  author={Rodrigo Toro Icarte and Toryn Q. Klassen and Richard Anthony Valenzano and Sheila A. McIlraith},
  booktitle={AAMAS},
  year={2018}
}
This paper examines the problem of how to teach multiple tasks to a Reinforcement Learning (RL) agent. To this end, we use Linear Temporal Logic (LTL) as a language for specifying multiple tasks in a manner that supports the composition of learned skills. We also propose a novel algorithm that exploits LTL progression and off-policy RL to speed up learning without compromising convergence guarantees, and show that our method outperforms the state-of-the-art approach on randomly generated…
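The abstract's key ingredient, LTL progression, is easy to illustrate: given a formula and the set of propositions that hold in the current state, progression rewrites the formula into what must still be satisfied from the next state onward, so a partially completed task becomes a new, smaller task. The sketch below is a minimal Python illustration, not the authors' implementation; the tuple encoding of formulas, the simplify helper, and the example propositions got_wood and used_toolshed are assumptions made here for exposition.

def prog(phi, true_props):
    """Progress LTL formula `phi` through a state labeled with the set `true_props`."""
    op = phi[0]
    if op == 'true':
        return ('true',)
    if op == 'false':
        return ('false',)
    if op == 'prop':                      # atomic proposition p
        return ('true',) if phi[1] in true_props else ('false',)
    if op == 'not':                       # negation, assumed to guard atoms only
        return ('false',) if prog(phi[1], true_props) == ('true',) else ('true',)
    if op == 'and':
        return simplify(('and', prog(phi[1], true_props), prog(phi[2], true_props)))
    if op == 'or':
        return simplify(('or', prog(phi[1], true_props), prog(phi[2], true_props)))
    if op == 'next':                      # X phi: the obligation moves to the next state
        return phi[1]
    if op == 'until':                     # phi1 U phi2
        return simplify(('or', prog(phi[2], true_props),
                         simplify(('and', prog(phi[1], true_props), phi))))
    if op == 'eventually':                # F phi == true U phi
        return simplify(('or', prog(phi[1], true_props), phi))
    raise ValueError('unknown operator: %s' % op)

def simplify(phi):
    """Basic boolean simplification so progressed formulas stay small and readable."""
    op = phi[0]
    if op in ('and', 'or'):
        a, b = phi[1], phi[2]
        unit = ('true',) if op == 'and' else ('false',)   # identity element
        zero = ('false',) if op == 'and' else ('true',)   # absorbing element
        if a == zero or b == zero:
            return zero
        if a == unit:
            return b
        if b == unit:
            return a
    return phi

# Example task: eventually get wood AND eventually use the toolshed, in any order.
task = ('and', ('eventually', ('prop', 'got_wood')),
               ('eventually', ('prop', 'used_toolshed')))
print(prog(task, {'got_wood'}))
# -> ('eventually', ('prop', 'used_toolshed')): only the toolshed obligation remains

In the paper's setting, as described in the abstract, each distinct progressed formula can be treated as its own learning target and updated off-policy from shared experience; the sketch above covers only the progression step.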
Advising and Instructing Reinforcement Learning Agents with LTL and Automata
Reinforcement learning (RL) involves an agent learning, through interaction with an environment, how to behave so as to maximize the expected cumulative reward (Sutton and Barto 1998).
From STL Rulebooks to Rewards
A principled approach to shaping rewards for reinforcement learning from multiple objectives that are given as a partially-ordered set of signal temporal logic (STL) rules is proposed.
Modular Deep Reinforcement Learning with Temporal Logic Specifications
We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but…
Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning
This paper formally defines Extended Markov Games as a general mathematical model that allows multiple RL agents to concurrently learn various non-Markovian specifications and uses this model to train two different logic-based multi-agent RL algorithms to solve diverse settings of non-Markovian co-safe LTL specifications.
Evolutionary reinforcement learning for sparse rewards
GEATL is presented, the first hybrid on-policy evolutionary-based algorithm that combines the advantages of gradient learning in deep RL with the exploration ability of evolutionary algorithms, in order to solve the sparse reward problem pertaining to TL specifications.
Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning
We present a reinforcement learning (RL) framework to synthesize a control policy from a given linear temporal logic (LTL) specification in an unknown stochastic environment that can be modeled as a…
Deep Reinforcement Learning with Temporal Logics
This work proposes a deep Reinforcement Learning method for policy synthesis in continuous-state/action unknown environments, under requirements expressed in Linear Temporal Logic (LTL), and shows that this combination lifts the applicability of deep RL to complex temporal and memory-dependent policy synthesis goals.
Symbolic Plans as High-Level Instructions for Reinforcement Learning
An empirical evaluation shows that using techniques from knowledge representation and reasoning to define final-state goal tasks and automatically produce their corresponding reward functions converges to near-optimal solutions faster than standard RL and HRL methods.
Compositional RL Agents That Follow Language Commands in Temporal Logic
A novel form of multi-task learning for RL agents is developed that allows them to learn from a diverse set of tasks and generalize to a new set of diverse tasks without any additional training.
Imitation Learning over Heterogeneous Agents with Restraining Bolts
A common problem in Reinforcement Learning (RL) is that the reward function is hard to express. This can be overcome by resorting to Inverse Reinforcement Learning (IRL), which consists in first…

References

Showing 1-10 of 56 references
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Transfer from Multiple MDPs
The theoretical properties of this transfer method are investigated and novel algorithms adapting the transfer process on the basis of the similarity between source and target tasks are introduced.
Reinforcement Learning with Hierarchies of Machines
This work presents provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrates their effectiveness on a problem with several thousand states.
Environment-Independent Task Specifications via GLTL
A new task-specification language for Markov decision processes is proposed that is designed to be an improvement over reward functions by being environment independent; it is extended to probabilistic specifications in a way that permits approximations to be learned in finite time.
Integrating Human Demonstration and Reinforcement Learning: Initial Results in Human-Agent Transfer
This work introduces Human-Agent Transfer (HAT), a method that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in…
Reinforcement Learning with a Hierarchy of Abstract Models
Simulations on a set of compositionally-structured navigation tasks show that H-DYNA can learn to solve them faster than conventional RL algorithms, and the abstract models can be used to solve stochastic control tasks.
Reinforcement learning with temporal logic rewards
  • Xiao Li, C. Vasile, C. Belta
  • Computer Science
  • 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2017
It is shown in simulated trials that learning is faster and policies obtained using the proposed approach outperform the ones learned using heuristic rewards in terms of the robustness degree, i.e., how well the tasks are satisfied.
Multitask Reinforcement Learning on the Distribution of MDPs
This paper addresses a new problem in reinforcement learning by introducing an environmental class, BV-MDPs, defined with a distribution of MDPs, and focuses on statistical information (mean and deviation) about the agent's value tables from its past learning experiences.
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way, and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
Hindsight Experience Replay
A novel technique is presented which allows sample-efficient learning from rewards which are sparse and binary, and therefore avoids the need for complicated reward engineering; it may be seen as a form of implicit curriculum.