Temporal logic motion control using actor–critic methods

  • Xu Chu Ding, Jing Wang, Morteza Lahijanian, Ioannis Ch. Paschalidis, Calin A. Belta
  • Published 10 February 2012
  • Computer Science
  • The International Journal of Robotics Research, pp. 1329–1344
This paper considers the problem of deploying a robot from a specification given as a temporal logic statement about some properties satisfied by the regions of a large, partitioned environment. We assume that the robot has noisy sensors and actuators and model its motion through the regions of the environment as a Markov decision process (MDP). The robot control problem becomes finding the control policy which maximizes the probability of satisfying the temporal logic task on the MDP. For a… 
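
The objective stated in the abstract, maximizing the probability that the MDP satisfies the task, can be illustrated on a toy problem. The sketch below is not the paper's actor–critic method: it uses plain value iteration to compute the maximum probability of reaching a goal state while avoiding an unsafe state, which is the simplest instance of this kind of objective. The four-state MDP, its action names, and the function `max_reach_probability` are all hypothetical and chosen only for illustration.

```python
# Hypothetical 4-state MDP (illustration only, not taken from the paper).
# State 2 is the goal and state 3 is an unsafe absorbing state.
# P[s][a] lists (next_state, probability) pairs for action a in state s.
P = {
    0: {"safe": [(1, 0.9), (0, 0.1)], "risky": [(2, 0.5), (3, 0.5)]},
    1: {"direct": [(2, 0.8), (3, 0.2)],
        "cautious": [(2, 0.7), (1, 0.2), (3, 0.1)]},
}
GOAL, UNSAFE = 2, 3


def max_reach_probability(P, goal, unsafe, tol=1e-10, max_iter=10_000):
    """Value iteration for the maximum probability of reaching `goal`
    before entering `unsafe`. Returns (values, greedy policy)."""
    states = set(P) | {goal, unsafe}
    v = {s: 0.0 for s in states}  # start from 0 so iterates increase
    v[goal] = 1.0                 # reaching the goal counts as success
    policy = {}
    for _ in range(max_iter):
        delta = 0.0
        for s in P:  # only non-absorbing states are updated
            best_a, best_q = None, -1.0
            for a, outcomes in P[s].items():
                q = sum(p * v[t] for t, p in outcomes)
                if q > best_q:
                    best_a, best_q = a, q
            delta = max(delta, abs(best_q - v[s]))
            v[s] = best_q
            policy[s] = best_a
        if delta < tol:
            break
    return v, policy


values, policy = max_reach_probability(P, GOAL, UNSAFE)
print(values[0], policy)
```

Exact value iteration of this kind needs the full transition model and a sweep over every state, so it is only practical for small state spaces. That is the usual motivation for approximate, sample-based methods on large environments like the one the abstract describes.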


Accelerated Reinforcement Learning for Temporal Logic Control Objectives

A novel accelerated model-based reinforcement learning (RL) algorithm for LTL control objectives that is capable of learning control policies significantly faster than related approaches is proposed.

Learning-Based Probabilistic LTL Motion Planning With Environment and Motion Uncertainties

A reinforcement learning-based approach is developed to generate policies that fulfill the desired LTL specifications as much as possible by optimizing the expected discounted utility of the relaxed product MDP.

Safety-Critical Learning of Robot Control with Temporal Logic Specifications

This paper proposes a learning-based robotic control framework and shows an ECBF-based modular deep RL algorithm that achieves near-perfect success rates and safety guarding with high probability confidence during training.

Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

A model-free reinforcement learning algorithm to synthesize control policies that maximize the probability of satisfying high-level control objectives given as Linear Temporal Logic formulas, which is even more general than a fully unknown MDP.

Optimal Probabilistic Motion Planning with Potential Infeasible LTL Constraints

To the best of the authors' knowledge, this is the first work that bridges the gap between planning revision and optimal control synthesis of both plan prefix and plan suffix of the agent trajectory over the infinite horizon.

Reinforcement Learning Based Temporal Logic Control with Soft Constraints Using Limit-deterministic Generalized Büchi Automata

Rigorous analysis shows that any RL algorithm that optimizes the expected return is guaranteed to create policies that can satisfy the acceptance condition of relaxed product MDP and reduce the violation cost over long-term behaviors.

Reinforcement Learning Based Temporal Logic Control with Maximum Probabilistic Satisfaction

A model-free RL-based motion planning strategy is developed to generate the optimal policy that maximizes the satisfaction probability of complex tasks, which are expressed by linear temporal logic (LTL) specifications.

Robust Satisfaction of Temporal Logic Specifications via Reinforcement Learning

It is demonstrated via a pair of robot navigation simulation case studies that reinforcement learning with robustness maximization performs better than probability maximization in terms of both probability of satisfaction and expected robustness.

Analyzing and revising high-level robot behaviors under actuator error

The approach described in this paper composes probabilistic models of the environment behavior and the robot actuation error with the synthesized controller, and uses probabilistic model checking techniques to find the probability that the robot satisfies a set of high-level specifications.

Analyzing and revising synthesized controllers for robots with sensing and actuation errors

A method is described for probabilistically analyzing the behavior of a robot controller synthesized from a set of temporal logic specifications when the robot operates with uncertainty in its sensing and actuation.



Least squares temporal difference actor-critic methods with applications to robot motion control

This work transforms the problem of finding a control policy for a Markov Decision Process (MDP) to maximize the probability of reaching some states while avoiding some other states into a Stochastic Shortest Path (SSP) problem and develops a new approximate dynamic programming algorithm to solve it.

LTL Control in Uncertain Environments with Probabilistic Satisfaction Guarantees

The problem of creating a robot control strategy that maximizes the probability of accomplishing a task can be reduced to the problem of generating a control policy for a Markov Decision Process (MDP) such that the probability of satisfying an LTL formula over its states is maximized.

Motion planning and control from temporal logic specifications with probabilistic satisfaction guarantees

An algorithm inspired by probabilistic Computation Tree Logic (PCTL) model checking to find a control strategy that maximizes the probability of satisfying the specification is proposed.

Temporal Logic Motion Planning and Control With Probabilistic Satisfaction Guarantees

We describe a computational framework for automatic deployment of a robot with sensor and actuator noise from a temporal logic specification over a set of properties that are satisfied by the regions

Optimal and Efficient Stochastic Motion Planning in Partially-Known Environments

A framework capable of computing optimal control policies for a continuous system in the presence of both action and environment uncertainty is presented and experiments confirm that the framework recomputes high-quality policies in seconds and is orders of magnitude faster than existing methods.

Multi-Robot Motion Planning: A Timed Automata Approach

This paper describes how a network of interacting timed automata can be used to model, analyze, and verify motion planning problems in a scenario with multiple robotic vehicles. The method

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

This work models the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities, and develops a method for synthesizing control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments.

Where's Waldo? Sensor-Based Temporal Logic Motion Planning

This paper provides a framework for automatically and verifiably composing controllers that satisfy high level task specifications expressed in suitable temporal logics that can express complex robot behaviors such as search and rescue, coverage, and collision avoidance.

A learning based approach to control synthesis of Markov decision processes for linear temporal logic specifications

We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that

Robust control of uncertain Markov Decision Processes with temporal logic specifications

A procedure from probabilistic model checking is used to combine the system model with an automaton representing the specification and this new MDP is transformed into an equivalent form that satisfies assumptions for stochastic shortest path dynamic programming.