Using Markov decision processes to optimise a non-linear functional of the final distribution, with manufacturing applications


We consider manufacturing problems that can be modelled as finite-horizon Markov decision processes for which the effective reward function is either a strictly concave or a strictly convex functional of the distribution of the final state. Reward structures such as these often arise when penalty factors are incorporated into the usual expected reward…
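
As an illustration of the setting only, and not the method of the paper, the following minimal sketch propagates the state distribution of a small finite-horizon Markov decision process under a fixed deterministic policy and then evaluates a strictly concave functional of the final distribution. The two-state model, the names `final_distribution` and `penalised_reward`, and the quadratic penalty around a target distribution are all assumptions chosen for the example; the abstract does not specify the penalty form.

```python
import numpy as np


def final_distribution(P, policy, d0):
    """Return the state distribution after the last stage.

    P[a] is the (S x S) transition matrix for action a, policy[t][s] is the
    action chosen in state s at stage t, and d0 is the initial distribution.
    """
    d = np.asarray(d0, dtype=float)
    n_states = d.shape[0]
    for stage_actions in policy:
        # Row s of this stage's transition matrix is taken from the matrix
        # of the action the policy selects in state s.
        P_t = np.array([P[stage_actions[s]][s] for s in range(n_states)])
        d = d @ P_t
    return d


def penalised_reward(d, r, target, lam):
    """Hypothetical objective: expected final reward minus a quadratic penalty.

    The linear term r @ d is the usual expected reward; subtracting
    lam * ||d - target||^2 makes the objective strictly concave in d.
    """
    return float(r @ d - lam * np.sum((d - target) ** 2))


# Toy two-state, two-action, two-stage model (illustrative numbers only).
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
])
policy = [[0, 1], [1, 0]]      # policy[t][s]
d0 = [1.0, 0.0]

d_T = final_distribution(P, policy, d0)
value = penalised_reward(d_T, r=np.array([1.0, 0.0]),
                         target=np.array([0.5, 0.5]), lam=2.0)
```

Because the objective here depends on the whole final distribution rather than on a sum of per-state expected rewards, it is evaluated after the forward pass over the horizon instead of being accumulated stage by stage.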

