• Corpus ID: 16885468

Optimizing Production Manufacturing Using Reinforcement Learning

Sridhar Mahadevan and Georgios Theocharous, FLAIRS Conference
Many industrial processes involve making parts with an assembly of machines, where each machine carries out an operation on a part, and the finished product requires a whole series of operations. A well-studied example of such a factory structure is the transfer line, which involves a sequence of machines. Optimizing transfer lines has been a subject of much study in the industrial engineering and operations research fields. A desirable goal of a lean manufacturing system is to maximize demand… 
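To make the transfer-line setting concrete, here is a minimal sketch of reinforcement learning on a toy two-machine line: machine 1 feeds a finite buffer, machine 2 pulls parts from it when demand arrives, and a tabular Q-learning agent decides each step whether machine 1 should produce. All model parameters (buffer size, completion and demand probabilities, holding cost) are illustrative assumptions, not numbers from the paper.

```python
import random

random.seed(0)
B = 3                      # buffer capacity between the two machines
ACTIONS = [0, 1]           # 0 = idle machine 1, 1 = produce a part
P_DONE = 0.9               # chance machine 1 finishes a part this step
P_DEMAND = 0.6             # chance machine 2 pulls a part (a demand arrival)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(buf, action):
    """One step of the toy line: reward = parts shipped minus holding cost."""
    if action == 1 and buf < B and random.random() < P_DONE:
        buf += 1                               # machine 1 adds to the buffer
    shipped = 0
    if buf > 0 and random.random() < P_DEMAND:
        buf -= 1                               # machine 2 meets a demand
        shipped = 1
    return buf, shipped - 0.1 * buf            # penalize work-in-process

Q = {(s, a): 0.0 for s in range(B + 1) for a in ACTIONS}
buf = 0
for _ in range(50_000):
    # epsilon-greedy action selection over the buffer-level state
    if random.random() < EPS:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[(buf, x)])
    nxt, r = step(buf, a)
    Q[(buf, a)] += ALPHA * (r + GAMMA * max(Q[(nxt, b)] for b in ACTIONS) - Q[(buf, a)])
    buf = nxt

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(B + 1)}
```

The learned policy maps each buffer level to produce/idle; with a holding cost on work-in-process, the agent learns to produce when the buffer is low, which is the flavor of demand-driven control the lean-manufacturing framing describes.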


Value Function Based Production Scheduling
This paper describes a Markov Decision Process (MDP) formulation of production scheduling that captures stochasticity in both production and demand, and presents an industrial application together with two reinforcement learning methods for generating an approximate value function in this domain.
Designing an adaptive production control system using reinforcement learning
This paper addresses the design of RL for an adaptive production control system using the real-world example of order dispatching in a complex job shop, and examines how the design of the state, action, and reward functions affects RL performance.
Towards Optimal Assembly Line Order Sequencing with Reinforcement Learning: A Case Study
Discusses a reinforcement learning approach to production optimization that finds order sequences yielding high throughput; simulation results show promising accuracy in relating candidate sequences to their corresponding throughput.
Application of Reinforcement Learning for the Generation of an Assembly Plant Entry Control Policy
Simulation results show that the reinforcement learning agent produces production plans that achieve better productivity than the heuristic controller under tight planning horizons, generating sub-optimal yet acceptable production mix balance.
Manufacturing Scheduling Using Colored Petri Nets and Reinforcement Learning
Agent-based intelligent manufacturing control systems are capable of efficiently responding and adapting to environmental changes.
Intelligent dynamic control policies for serial production lines
Heuristic production control policies such as CONWIP, kanban, and other hybrid policies have been in use for years as better alternatives to MRP-based push control policies.
Application of reinforcement learning for agent-based production scheduling
Towards Standardizing Reinforcement Learning Approaches for Stochastic Production Scheduling
This work standardizes the description of production setups used in RL studies based on established nomenclature, and proposes recommendations for a validation scheme focusing on reproducibility and sufficient benchmarking.
Stochastic production scheduling to meet demand forecasts
This work describes a Markov decision process formulation of production scheduling which captures stochasticity, while retaining the ability to construct a schedule to meet demand forecasts, and describes an industrial application and a reinforcement learning method for generating an approximate value function.
Application of reinforcement learning to multi-agent production scheduling
Reinforcement learning (RL) has received attention in recent years from agent-based researchers because it can be applied to problems where autonomous agents learn to select appropriate actions.


Hierarchical Decision Making in Stochastic Manufacturing Systems
Most manufacturing systems are large, complex, and subject to uncertainty. Obtaining exact feedback policies to run these systems is nearly impossible, both theoretically and computationally.
Improving Elevator Performance Using Reinforcement Learning
Results in simulation surpass the best of the heuristic elevator control algorithms of which the author is aware and demonstrate the power of RL on a very large scale stochastic dynamic optimization problem of practical utility.
Dynamic Programming and Optimal Control
The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning, and sequential decision making.
Reinforcement Learning with Hierarchies of Machines
This work presents provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrates their effectiveness on a problem with several thousand states.
The NSF Workshop on Reinforcement Learning: Summary and Observations
The goals of the meeting were to understand limitations of current RL systems and define promising directions for further research, clarify the relationships between RL and existing work in engineering fields, such as operations research, and identify potential industrial applications of RL.
Reinforcement Learning: A Survey
Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
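The exploration/exploitation trade-off named in this survey snippet is often introduced with the epsilon-greedy rule: with small probability explore a random action, otherwise exploit the current best estimate. A minimal multi-armed-bandit sketch follows; the arm payoffs and epsilon value are illustrative assumptions.

```python
import random

random.seed(1)
TRUE_MEANS = [0.2, 0.5, 0.8]        # hypothetical arm payoffs, unknown to the agent
EPS = 0.1                           # exploration probability

counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]            # running estimate of each arm's mean payoff

for t in range(10_000):
    if random.random() < EPS:       # explore: try a random arm
        arm = random.randrange(3)
    else:                           # exploit: pick the current best estimate
        arm = max(range(3), key=lambda a: values[a])
    reward = 1.0 if random.random() < TRUE_MEANS[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean update

best = max(range(3), key=lambda a: values[a])
```

With enough trials the estimates converge toward the true means, so the agent ends up pulling the best arm most of the time while still occasionally sampling the others.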
Exploiting Structure in Policy Construction
This work presents an algorithm, called Structured Policy Iteration (SPI), that constructs optimal policies without explicit enumeration of the state space; it retains the fundamental computational steps of the commonly used modified policy iteration algorithm, but exploits the variable and propositional independencies reflected in a temporal Bayesian network representation of MDPs.
A study of the Toyota production system from an industrial engineering viewpoint
Topics include the mechanism of the production function, improvement of process and of operation, development of non-stock production, and an interpretation of the mechanism of the Toyota Production System (TPS).
Markov Decision Processes: Discrete Stochastic Dynamic Programming
  • M. Puterman, Wiley Series in Probability and Statistics, 1994
Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.
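The discounted dynamic-programming machinery treated in this book can be illustrated with value iteration on a tiny MDP. The two-state model below is a hypothetical example with made-up transitions and rewards, not one drawn from the text.

```python
# Discounted value iteration on a tiny two-state MDP (illustrative numbers).
GAMMA, TOL = 0.9, 1e-8
STATES, ACTIONS = [0, 1], [0, 1]
# P[s][a] = list of (next_state, probability); R[s][a] = expected immediate reward
P = {0: {0: [(0, 1.0)], 1: [(1, 0.8), (0, 0.2)]},
     1: {0: [(0, 1.0)], 1: [(1, 1.0)]}}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 0.0, 1: 2.0}}

V = {s: 0.0 for s in STATES}
while True:
    # Bellman optimality backup for every state
    V_new = {s: max(R[s][a] + GAMMA * sum(p * V[t] for t, p in P[s][a])
                    for a in ACTIONS)
             for s in STATES}
    done = max(abs(V_new[s] - V[s]) for s in STATES) < TOL
    V = V_new
    if done:
        break

# Greedy policy with respect to the converged value function
policy = {s: max(ACTIONS, key=lambda a: R[s][a] + GAMMA * sum(p * V[t] for t, p in P[s][a]))
          for s in STATES}
```

Here state 1 under action 1 earns reward 2 forever, so its value converges to 2/(1 - 0.9) = 20, and the greedy policy chooses action 1 in both states.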
Probabilistic Robot Navigation in Partially Observable Environments
First results are reported from a research program that uses partially observable Markov models to robustly track a robot's location in office environments and to direct its goal-oriented actions.