Corpus ID: 227248002

DERAIL: Diagnostic Environments for Reward And Imitation Learning

@article{freire2020derail,
  title={DERAIL: Diagnostic Environments for Reward And Imitation Learning},
  author={Freire, Pedro and Gleave, Adam and Toyer, Sam and Russell, Stuart},
  journal={arXiv preprint arXiv:2012.01365},
  year={2020}
}
The objective of many real-world tasks is complex and difficult to procedurally specify. This makes it necessary to use reward or imitation learning algorithms to infer a reward or policy directly from human data. Existing benchmarks for these algorithms focus on realism, testing in complex environments. Unfortunately, these benchmarks are slow, unreliable and cannot isolate failures. As a complementary approach, we develop a suite of simple diagnostic tasks that test individual facets of…
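The abstract's idea of "simple diagnostic tasks" can be illustrated with a toy environment that isolates one algorithmic facet. The example below is a hypothetical sketch, not one of the paper's actual tasks; the `reset()`/`step()` interface and the `DelayedRewardMDP` name are assumptions chosen to mimic the common RL environment API.

```python
# A minimal diagnostic MDP in the spirit the abstract describes: a tiny,
# procedurally specified environment that isolates a single facet (here,
# whether a learner prefers a delayed larger reward over an immediate
# smaller one). Hypothetical illustration only.

class DelayedRewardMDP:
    """Two-state chain: action 0 ends the episode immediately with reward
    0.1; action 1 defers for one step, then yields reward 1.0."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Returns (observation, reward, done).
        if self.state == 0:
            if action == 0:               # take the small immediate reward
                return self.state, 0.1, True
            self.state = 1                # defer: move to the waiting state
            return self.state, 0.0, False
        return self.state, 1.0, True      # collect the delayed larger reward


# An optimal policy defers, earning 1.0 total instead of 0.1.
env = DelayedRewardMDP()
obs = env.reset()
obs, r1, done = env.step(1)
obs, r2, done = env.step(1)
total = r1 + r2
```

Because the dynamics are this small, a failure (e.g. an imitation learner collapsing to the myopic action) is immediately attributable to one facet of the algorithm, which is the diagnostic property the abstract contrasts with slow, realistic benchmarks.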
