Logically-Constrained Reinforcement Learning
@article{Hasanbeig2018LogicallyConstrainedRL,
  title={Logically-Constrained Reinforcement Learning},
  author={Mohammadhosein Hasanbeig and Alessandro Abate and D. Kroening},
  journal={arXiv: Learning},
  year={2018}
}
We present the first model-free Reinforcement Learning (RL) algorithm to synthesise policies for an unknown Markov Decision Process (MDP), such that a linear time property is satisfied. The given temporal property is converted into a Limit Deterministic Büchi Automaton (LDBA) and a robust reward function is defined over the state-action pairs of the MDP according to the resulting LDBA. With this reward function, the policy synthesis procedure is "constrained" by the given specification. These…
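The pipeline described in the abstract (specification to automaton, automaton-derived reward, model-free learning) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the five-cell corridor MDP, the hand-written two-state automaton for the property "eventually goal", the reward of 1 on entering or remaining in an accepting automaton state, and the names `step`, `label`, `automaton_step`, and `train`. It uses plain tabular Q-learning over (MDP state, automaton state) pairs and does not reproduce the authors' LDBA construction or their reward function.

```python
import random

# Hypothetical 5-cell corridor. The learner only samples step(); it never reads the dynamics.
N_STATES = 5
ACTIONS = (-1, +1)  # action index 0 = left, index 1 = right
GOAL = 4


def step(state, action_idx):
    """Environment transition (deterministic here to keep the sketch small)."""
    return min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)


def label(state):
    """Atomic propositions that hold in an MDP state."""
    return {"goal"} if state == GOAL else set()


# Hand-written automaton for "eventually goal":
# q0 --goal--> q1, where q1 is accepting and absorbing.
ACCEPTING = {1}


def automaton_step(q, props):
    """Advance the automaton on the label of the newly reached MDP state."""
    if q == 0 and "goal" in props:
        return 1
    return q


def train(episodes=2000, horizon=40, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning over product states (s, q)."""
    rng = random.Random(seed)
    Q = {((s, q), a): 0.0
         for s in range(N_STATES) for q in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        s, q = 0, 0
        for _ in range(horizon):
            a = rng.randrange(2)  # uniform exploration; Q-learning is off-policy
            s2 = step(s, a)
            q2 = automaton_step(q, label(s2))
            # Reward comes from the automaton, not from the environment.
            r = 1.0 if q2 in ACCEPTING else 0.0
            best_next = max(Q[(s2, q2), b] for b in (0, 1))
            Q[(s, q), a] += alpha * (r + gamma * best_next - Q[(s, q), a])
            s, q = s2, q2
    return Q


Q = train()
# Greedy action in each non-goal cell while the automaton is still in q0.
policy = {s: max((0, 1), key=lambda a: Q[(s, 0), a]) for s in range(GOAL)}
```

Because the reward depends only on the automaton state, the learned greedy policy is shaped by the specification rather than by any task-specific reward, which is the sense in which the synthesis is "constrained".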
22 Citations
Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees
- Computer Science, Engineering
- 2019 IEEE 58th Conference on Decision and Control (CDC)
- 2019
- 24
Reinforcement Learning of Control Policy for Linear Temporal Logic Specifications Using Limit-Deterministic Generalized Büchi Automata
- Computer Science, Engineering
- IEEE Control Systems Letters
- 2020
- 4
- Highly Influenced
Formal Policy Synthesis for Continuous-Space Systems via Reinforcement Learning
- Computer Science, Engineering
- IFM
- 2020
- 4
Control Synthesis from Linear Temporal Logic Specifications using Model-Free Reinforcement Learning
- Computer Science
- 2020 IEEE International Conference on Robotics and Automation (ICRA)
- 2020
- 16
Model-based Reinforcement Learning from Signal Temporal Logic Specifications
- Computer Science, Engineering
- ArXiv
- 2020