# L*-Based Learning of Markov Decision Processes (Extended Version)

@article{Tappler2021LBasedLO, title={L*-Based Learning of Markov Decision Processes (Extended Version)}, author={Martin Tappler and Bernhard K. Aichernig and Giovanni Bacci and Maria Eichlseder and Kim Guldstrand Larsen}, journal={Formal Aspects Comput.}, year={2021}, volume={33}, pages={575-615} }

Automata learning techniques automatically generate system models from test observations. Typically, these techniques fall into two categories: passive and active. On the one hand, passive learning assumes no interaction with the system under learning and uses a predetermined training set, e.g., system logs. On the other hand, active learning techniques collect training data by actively querying the system under learning, allowing one to steer the discovery of meaningful information about the…

## 2 Citations

L*-Based Learning of Markov Decision Processes

- Computer Science
- FM
- 2019

This paper focuses on automata learning techniques, which automatically generate system models from test observations, and in particular on active learning, which queries the system under learning and is considered more efficient.

## References

Showing 1-10 of 79 references.

L*-Based Learning of Markov Decision Processes

- Computer Science
- FM
- 2019

This paper focuses on automata learning techniques, which automatically generate system models from test observations, and in particular on active learning, which queries the system under learning and is considered more efficient.

Learning Probabilistic Systems from Tree Samples

- Computer Science, Mathematics
- 2012 27th Annual IEEE Symposium on Logic in Computer Science
- 2012

This work considers the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples and proposes learning algorithms that use traditional and a new stochastic state-space partitioning, the latter resulting in the minimum number of states.

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

- Computer Science, Mathematics
- Robotics: Science and Systems
- 2014

This work models the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities, and develops a method for synthesizing control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments.

Approximate Active Learning of Nondeterministic Input Output Transition Systems

- Computer Science
- Electron. Commun. Eur. Assoc. Softw. Sci. Technol.
- 2015

An adaptation of Angluin’s algorithm for active learning of nondeterministic, input-enabled, input-output transition systems is presented. It allows constructing partial or approximate models by expressing the relation between the SUL and the learned model as a refinement relation, not necessarily an equivalence.

Active Learning of Markov Decision Processes for System Verification

- Computer Science
- 2012 11th International Conference on Machine Learning and Applications
- 2012

An algorithm for learning deterministic Markov decision processes from data by actively guiding the selection of input actions is proposed and it is demonstrated that the proposed active learning procedure can significantly reduce the amount of data required to obtain accurate system models.

Learning Markov Decision Processes for Model Checking

- Computer Science
- QFM
- 2012

An algorithm is presented for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system. It is adapted from algorithms for learning deterministic probabilistic finite automata and extended to include both probabilistic and nondeterministic transitions.

Active learning for extended finite state machines

- Computer Science
- Formal Aspects of Computing
- 2016

A black-box active learning algorithm for inferring extended finite state machines (EFSMs) by dynamic black-box analysis is presented. It is based on a novel learning model using so-called tree queries, which induces a generalization of the classical Nerode equivalence and canonical automata construction to the symbolic setting.

Active Automata Learning in Practice - An Annotated Bibliography of the Years 2011 to 2016

- Computer Science
- Machine Learning for Dynamic Software Analysis
- 2018

The progress that has been made over the past five years is reviewed, the status of active automata learning techniques with respect to applications in the field of software engineering is assessed, and an updated agenda for future research is presented.

Learning Regular Sets from Queries and Counterexamples

- Computer Science, Mathematics
- Inf. Comput.
- 1987

A learning algorithm L* is described that correctly learns any regular set from any minimally adequate Teacher in time polynomial in the number of states of the minimum DFA for the set and the maximum length of any counterexample provided by the Teacher.

Learning deterministic probabilistic automata from a model checking perspective

- Computer Science, Mathematics
- Machine Learning
- 2016

This paper shows how to extend the basic algorithm to also learn automata models for both reactive and timed systems and establishes theoretical convergence properties for the learning algorithm as well as for probability estimates of system properties expressed in linear time temporal logic and linear continuous stochastic logic.