Corpus ID: 174797757

Off-Policy Evaluation via Off-Policy Classification

@article{Irpan2019OffPolicyEV,
  title={Off-Policy Evaluation via Off-Policy Classification},
  author={Alex Irpan and Kanishka Rao and Konstantinos Bousmalis and Chris Harris and Julian Ibarz and Sergey Levine},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.01624}
}
  • Alex Irpan, Kanishka Rao, Konstantinos Bousmalis, Chris Harris, Julian Ibarz, Sergey Levine
  • Published 2019
  • Computer Science, Mathematics
  • ArXiv
  • In this work, we consider the problem of model selection for deep reinforcement learning (RL) in real-world environments. Typically, the performance of deep RL algorithms is evaluated via on-policy interactions with the target environment. However, comparing models in a real-world environment for the purposes of early stopping or hyperparameter tuning is costly and often practically infeasible. This leads us to examine off-policy policy evaluation (OPE) in such settings. We focus on OPE for…
