An Evolutionary Random Policy Search Algorithm for Solving Markov Decision Processes


This paper presents a new randomized search method called evolutionary random policy search (ERPS) for solving infinite-horizon discounted-cost Markov-decision-process (MDP) problems. The algorithm is particularly targeted at problems with large or uncountable action spaces. ERPS approaches a given MDP by iteratively dividing it into a sequence of smaller…
DOI: 10.1287/ijoc.1050.0155
10 Figures and Tables


