A Bayesian Approach for Policy Learning from Trajectory Preference Queries

@inproceedings{Wilson2012ABA,
  title={A Bayesian Approach for Policy Learning from Trajectory Preference Queries},
  author={Aaron Wilson and Alan Fern and Prasad Tadepalli},
  booktitle={NIPS},
  year={2012}
}
We consider the problem of learning control policies via trajectory preference queries to an expert. In particular, the agent presents an expert with short runs of a pair of policies originating from the same state and the expert indicates which trajectory is preferred. The agent’s goal is to elicit a latent target policy from the expert with as few queries as possible. To tackle this problem we propose a novel Bayesian model of the querying process and introduce two methods that exploit this…
