# L*-Based Learning of Markov Decision Processes (Extended Version)

@article{Tappler2021LBasedLO, title={L*-Based Learning of Markov Decision Processes (Extended Version)}, author={Martin Tappler and Bernhard K. Aichernig and Giovanni Bacci and Maria Eichlseder and Kim Guldstrand Larsen}, journal={Formal Aspects Comput.}, year={2021}, volume={33}, pages={575-615} }

Automata learning techniques automatically generate system models from test observations. Typically, these techniques fall into two categories: passive and active. On the one hand, passive learning assumes no interaction with the system under learning and uses a predetermined training set, e.g., system logs. On the other hand, active learning techniques collect training data by actively querying the system under learning, allowing one to steer the discovery of meaningful information about the…

## 3 Citations

L*-Based Learning of Markov Decision Processes

- Computer Science
- FM
- 2019

This paper focuses on automata learning techniques, which automatically generate system models from test observations, and in particular on active learning, which queries the system under learning and is considered more efficient.

Learning of Structurally Unambiguous Probabilistic Grammars

- Computer Science
- AAAI
- 2021

It is shown that the learned CMTA can be converted into a probabilistic grammar, thus providing a complete algorithm for learning a structurally unambiguous probabilistic context-free grammar using structured membership queries and structured equivalence queries.

## References


L*-Based Learning of Markov Decision Processes

- Computer Science
- FM
- 2019

This paper focuses on automata learning techniques, which automatically generate system models from test observations, and in particular on active learning, which queries the system under learning and is considered more efficient.

Learning Probabilistic Systems from Tree Samples

- Computer Science
- 2012 27th Annual IEEE Symposium on Logic in Computer Science
- 2012

This work considers the problem of learning a non-deterministic probabilistic system consistent with a given finite set of positive and negative tree samples and proposes learning algorithms that use traditional and a new stochastic state-space partitioning, the latter resulting in the minimum number of states.

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

- Computer Science
- Robotics: Science and Systems
- 2014

This work models the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities, and develops a method for synthesizing control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments.

Approximate Active Learning of Nondeterministic Input Output Transition Systems

- Computer Science
- Electron. Commun. Eur. Assoc. Softw. Sci. Technol.
- 2015

An adaptation of Angluin’s algorithm for active learning of nondeterministic, input-enabled, input-output transition systems is presented. It allows the construction of partial or approximate models by expressing the relation between the SUL and the learned model as a refinement relation, not necessarily an equivalence.

Active Learning of Markov Decision Processes for System Verification

- Computer Science
- 2012 11th International Conference on Machine Learning and Applications
- 2012

An algorithm for learning deterministic Markov decision processes from data by actively guiding the selection of input actions is proposed and it is demonstrated that the proposed active learning procedure can significantly reduce the amount of data required to obtain accurate system models.

Learning Markov Decision Processes for Model Checking

- Computer Science
- QFM
- 2012

An algorithm for automatically learning a deterministic labeled Markov decision process model from the observed behavior of a reactive system, adapted from algorithms for learning deterministic probabilistic finite automata and extended to include both probabilistic and nondeterministic transitions.

Active learning for extended finite state machines

- Computer Science
- Formal Aspects of Computing
- 2016

A black-box active learning algorithm for inferring extended finite state machines (EFSMs) by dynamic black-box analysis. It rests on a novel learning model that uses so-called tree queries and induces a generalization of the classical Nerode equivalence and canonical automata construction to the symbolic setting.

Active Automata Learning in Practice - An Annotated Bibliography of the Years 2011 to 2016

- Computer Science
- Machine Learning for Dynamic Software Analysis
- 2018

The progress that has been made over the past five years is reviewed, the status of active automata learning techniques with respect to applications in the field of software engineering is assessed, and an updated agenda for future research is presented.

Learning Behaviors of Automata from Multiplicity and Equivalence Queries

- Computer Science, Mathematics
- SIAM J. Comput.
- 1996

It is likely that $\mathbb{Q}$-automata are probably approximately correctly learnable (PAC-learnable) in polynomial time when multiplicity queries are allowed, and regular languages are polynomially predictable using membership queries with respect to the representation of unambiguous nondeterministic automata.