Markov Decision Processes: Concepts and Algorithms
@inproceedings{Otterlo2012MarkovDP, title={Markov Decision Processes: Concepts and Algorithms}, author={Martijn van Otterlo and Marco A Wiering}, year={2012} }
Situated in between supervised learning and unsupervised learning, the paradigm of reinforcement learning deals with learning in sequential decision making problems in which there is limited feedback. This text introduces the intuitions and concepts behind Markov decision processes and two classes of algorithms for computing optimal behaviors: reinforcement learning and dynamic programming. First the formal framework of Markov decision processes is defined, accompanied by the definition of value…
8 Citations
State-of-the-Art Reinforcement Learning Algorithms
- Computer Science
- 2020
This research paper brings together many aspects of current research in several fields associated with Reinforcement Learning, covering a wide variety of learning algorithms such as Markov Decision Processes, Q Learning, Temporal Difference Learning, Actor-Critic Algorithms, Deep Deterministic Policy Gradients, and the Evolution Strategies Algorithm.
Markov decision processes for services opportunity pipeline optimization
- Computer Science
- 2010
This project modelled the dynamics of sales opportunities by a Markov Decision Process, where states are modelled by the number of opportunities at five different maturity levels, actions are represented by investments and rewards by profits from signed contracts.
Dynamic Programming Models for Maximizing Customer Lifetime Value: An Overview
- BusinessIntelliSys
- 2019
Neither DDQN nor the algorithms that outperformed it and overcame its limitations were applied in the area of direct marketing and this leaves a space for future research directions.
Modelling sustainable supply networks with adaptive agents
- Business
- 2018 International Conference on Production and Operations Management Society (POMS)
- 2018
Simulation results generated through the application of this multi-agent modelling approach to a refrigerator production network show that the selected supply network configurations are capable of meeting intended sustainable goals while catering to the respective product-market profiles.
Multiscale modeling meets machine learning: What can we learn?
- Computer Science, Biology
- Archives of computational methods in engineering: state of the art reviews
- 2021
This review identifies areas in the biomedical sciences where machine learning and multiscale modeling can mutually benefit from one another, identifies applications and opportunities, raises open questions, and addresses potential challenges and limitations.
Mobile Edge Offloading Using Markov Decision Processes
- Computer Science
- EDGE
- 2018
This study proposes a Markov Decision Process (MDP) based methodology to intelligently make such choices while optimizing multiple objectives and demonstrates a 17.47% or greater increase in performance.
Smart grid-aware radio engineering in 5G mobile networks. (Ingénierie radio orientée smart grids dans les réseaux mobiles 5G)
- Computer Science, Political Science
- 2019
Energy demand in mobile telephone networks is increasing due to the emergence of new technologies and new services with ever higher requirements (data rates…
References
On characteristics of markov decision processes and reinforcement learning in large domains
- Computer Science
- 2005
This thesis proposes a new technique for automatically selecting certain structural parameters of the SDM model on-line based on training data and develops one application that uses measurements of the proposed attributes for improving exploration, the process by which the agent gathers experience for learning good behavior strategies.
The logic of adaptive behavior : knowledge representation and algorithms for the Markov decision process framework in first-order domains
- Computer Science
- 2008
This book studies lifting Markov decision processes, reinforcement learning and dynamic programming to the first-order (or, relational) setting, and a methodological translation is constructed from the propositional to the relational setting.
Markov Decision Processes: Discrete Stochastic Dynamic Programming
- Computer Science
- Wiley Series in Probability and Statistics
- 1994
Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.
Reinforcement Learning: An Introduction
- Computer Science
- IEEE Transactions on Neural Networks
- 2005
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
- Computer Science
- J. Artif. Intell. Res.
- 1999
This paper presents an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI, and describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans.
Problem solving with reinforcement learning
- Computer Science
- 1995
This thesis is concerned with practical issues surrounding the application of reinforcement learning techniques to tasks that take place in high dimensional continuous state-space environments. In…
Knowledge Representation for Stochastic Decision Process
- Computer Science
- Artificial Intelligence Today
- 1999
It is argued that while stochastic dynamics introduce certain complications when it comes to issues such as the frame and ramification problems, for the most part, intuitions underlying classical models can be extended to the stochastic setting.
On-line Q-learning using connectionist systems
- Computer Science
- 1994
Simulations show that on-line learning algorithms are less sensitive to the choice of training parameters than backward replay, and that the alternative update rules of MCQ-L and Q(λ) are more robust than standard Q-learning updates.
Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results
- Computer Science
- Machine Learning
- 2005
This paper presents a detailed study of average reward reinforcement learning, an undiscounted optimality framework that is more appropriate for cyclical tasks than the much better studied discounted framework, and a detailed sensitivity analysis of R-learning is carried out to test its dependence on learning rates and exploration levels.
Average reward reinforcement learning: Foundations, algorithms, and empirical results
- Computer Science
- Machine Learning
- 2004
This paper presents a detailed study of average reward reinforcement learning, an undiscounted optimality framework that is more appropriate for cyclical tasks than the much better studied discounted framework, and a detailed sensitivity analysis of R-learning is carried out to test its dependence on learning rates and exploration levels.