Targeting Specific Distributions of Trajectories in MDPs

@inproceedings{Roberts2006TargetingSD,
  title={Targeting Specific Distributions of Trajectories in MDPs},
  author={David L. Roberts and Mark J. Nelson and Charles Lee Isbell and Michael Mateas and Michael L. Littman},
  booktitle={AAAI},
  year={2006}
}
We define TTD-MDPs, a novel class of Markov decision processes where the traditional goal of an agent is changed from finding an optimal trajectory through a state space to realizing a specified distribution of trajectories through the space. After motivating this formulation, we show how to convert a traditional MDP into a TTD-MDP. We derive an algorithm for finding non-deterministic policies by constructing a trajectory tree that allows us to compute locally-consistent policies. We specify…
Semantic Scholar estimates that this publication has 74 citations based on the available data.
