This paper presents the application of an approximate dynamic programming (ADP) algorithm to the problem of job releasing and sequencing of a benchmark reentrant manufacturing line (RML). The ADP approach is based on the SARSA(λ) algorithm with linear approximation structures that are tuned through a gradient-descent approach. The optimization is…

This paper presents the application of a framework, proposed by the National Institute of Standards and Technology (NIST), for standard modular simulation in semiconductor wafer fabrication facilities (fabs). The application of the proposed framework resulted in the identification and specification of four different elements in the context of…

This paper presents an optimal policy for the problems of job releasing and sequencing in an adapted version of a benchmark Reentrant Manufacturing Line (RML). We consider a finite state space and an infinite-horizon discounted-cost optimization criterion. The resulting optimal policy provides a trade-off between throughput maximization (i.e., profits) and…
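
As an illustration of the finite-state, infinite-horizon discounted-cost setting named in the abstract above, the following is a minimal value-iteration sketch. It is not taken from the paper: the function name `value_iteration`, the data layout (`costs`, `transitions`), and the two-state toy model are all hypothetical, and the abstract does not say which solution method the authors used.

```python
def value_iteration(costs, transitions, discount=0.9, tol=1e-8, max_iters=10_000):
    """Value iteration for a finite MDP with discounted cost (hypothetical layout).

    costs[s][a]       -- one-step cost of action a in state s
    transitions[s][a] -- list of (probability, next_state) pairs
    Returns (values, policy), where policy[s] is a cost-minimizing action.
    """
    n_states = len(costs)

    def q_values(s, values):
        # Expected discounted cost of each action in state s under `values`.
        return [
            costs[s][a] + discount * sum(p * values[s2] for p, s2 in transitions[s][a])
            for a in range(len(costs[s]))
        ]

    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = [min(q_values(s, values)) for s in range(n_states)]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    policy = []
    for s in range(n_states):
        q = q_values(s, values)
        policy.append(q.index(min(q)))
    return values, policy


# Toy model (an assumption for illustration only): state 0 can wait at cost 1
# or pay 2 to move to state 1, which is absorbing and free.
toy_costs = [[1.0, 2.0], [0.0, 0.0]]
toy_transitions = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)], [(1.0, 1)]],
]
toy_values, toy_policy = value_iteration(toy_costs, toy_transitions)
```

In this toy model, waiting forever costs 1/(1 - 0.9) = 10 in discounted terms, so the minimizing choice in state 0 is to pay 2 once and move on.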

This paper presents the application of a reinforcement learning (RL) approach for the near-optimal control of a re-entrant line manufacturing (RLM) model. The RL approach utilizes an algorithm based on a gradient-descent TD(λ) method to obtain both estimates of the optimal cost function and the control actions. Numerical experiments demonstrated the…
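
For readers unfamiliar with gradient-descent TD(λ), the sketch below shows the standard update with a linear cost approximation and accumulating eligibility traces. It covers only the evaluation of a fixed policy, not the control scheme described in the abstract. The function name `td_lambda_linear`, the episode format, and the three-state chain are hypothetical and are not drawn from the paper.

```python
def td_lambda_linear(episodes, features, n_features, alpha=0.1, discount=0.9, lam=0.5):
    """Gradient-descent TD(lambda) for a linear cost estimate V(s) = w . phi(s).

    episodes -- iterable of episodes; each episode is a list of
                (state, cost, next_state) tuples, with next_state None at the end
    features -- function mapping a state to a list of n_features floats
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    weights = [0.0] * n_features
    for episode in episodes:
        trace = [0.0] * n_features  # eligibility trace, reset each episode
        for state, cost, next_state in episode:
            phi = features(state)
            v = dot(weights, phi)
            v_next = 0.0 if next_state is None else dot(weights, features(next_state))
            td_error = cost + discount * v_next - v
            # Accumulating trace: decay the old trace, then add the current gradient.
            trace = [discount * lam * z + x for z, x in zip(trace, phi)]
            weights = [w + alpha * td_error * z for w, z in zip(weights, trace)]
    return weights


# Hypothetical deterministic chain 0 -> 1 -> end, with cost 1 per step and
# one-hot features, so the weights should approach the true costs-to-go.
def one_hot(state):
    return [1.0 if i == state else 0.0 for i in range(2)]

chain_episode = [(0, 1.0, 1), (1, 1.0, None)]
learned = td_lambda_linear([chain_episode] * 500, one_hot, n_features=2)
```

With a discount factor of 0.9, the true costs-to-go in this chain are 1.9 for state 0 and 1.0 for state 1, which gives a simple check on the learned weights.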

In this paper, we present robust adaptive controller design for SISO linear systems with zero relative degree under noisy output measurements. We formulate the robust adaptive control problem as a nonlinear H∞-optimal control problem under imperfect state measurements, and then solve it using game theory. By using the a priori knowledge of the…

This paper presents initial results on the application of a simulation-based Approximate Dynamic Programming (ADP) approach to the control of a benchmark semiconductor fab model known as the Intel Mini-Fab. The ADP approach is based on an Average Cost Temporal-Difference TD(λ) learning algorithm under an Actor-Critic architecture.…

The sequential structure of complex actions is apparently learned at an abstract "cognitive" level in several regions of the frontal cortex, independent of the control of the immediate effectors by the motor system. At this level, actions are represented in terms of kinematic parameters – especially direction of end effector movement – and encoded using…

- Matthew Flint, Emmanuel Fernandez
- 2005

This paper considers the decentralized dynamic programming path planning decision processes of multiple cooperating autonomous aerial vehicles (UAVs) engaged in a search of an uncertain environment. What sets this paper apart from previous work is that a functional approximation is used for the dynamic programming (DP) cost-to-go function,…