Corpus ID: 724786

Interactive Value Iteration for Markov Decision Processes with Unknown Rewards

@inproceedings{Weng2013InteractiveVI,
  title={Interactive Value Iteration for Markov Decision Processes with Unknown Rewards},
  author={Paul Weng and Bruno Zanuttini},
  booktitle={IJCAI},
  year={2013}
}
  • Paul Weng, Bruno Zanuttini
  • Published in IJCAI 2013
  • Computer Science
  • To tackle the potentially hard task of defining the reward function in a Markov Decision Process, we propose a new approach, based on Value Iteration, which interweaves the elicitation and optimization phases. We assume that rewards whose numeric values are unknown can only be ordered, and that a tutor is present to help comparing sequences of rewards. We first show how the set of possible reward functions for a given preference relation can be represented as a polytope. Then our algorithm…


    Citations

    Publications citing this paper.

    Advantage based value iteration for Markov decision processes with unknown rewards


    Solving MDPs with Unknown Rewards Using Nondominated Vector-Valued Functions


    Approximate regret based elicitation in Markov decision process


    References

    Publications referenced by this paper.

    Robust Online Optimization of Reward-Uncertain MDPs


    Online feature elicitation in interactive optimization


    Regret-based Reward Elicitation for Markov Decision Processes


    Decision-Theoretic Planning: Structural Assumptions and Computational Leverage


    P. Viappiani and C. Boutilier. Recommendation sets and choice queries: There is no exploration/exploitation tradeoff! In AAAI, 2011.