Corpus ID: 229221825

BeBold: Exploration Beyond the Boundary of Explored Regions

  • Tianjun Zhang, Huazhe Xu, Xiaolong Wang, Yi Wu, K. Keutzer, J. Gonzalez, Y. Tian
  • Published 2020
  • Computer Science, Mathematics
  • ArXiv
  • Efficient exploration under sparse rewards remains a key challenge in deep reinforcement learning. To guide exploration, previous work makes extensive use of intrinsic reward (IR). There are many heuristics for IR, including visitation counts, curiosity, and state-difference. In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR. The criterion helps the agent explore Beyond the…
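
    The abstract names the criterion only in words, so the following is a minimal sketch of one possible reading rather than the authors' implementation. It assumes tabular counts over hashable states and interprets "regulated" as clipping the difference of inverse counts at zero; the names VisitCounts and intrinsic_reward are hypothetical and do not come from the paper.

```python
from collections import Counter


class VisitCounts:
    """Tabular visitation counts N(s) over hashable states (illustrative helper)."""

    def __init__(self):
        self.counts = Counter()

    def visit(self, state):
        # Record one more visit to `state`.
        self.counts[state] += 1

    def inverse_count(self, state):
        # 1 / N(s); undefined for states that were never visited.
        n = self.counts[state]
        if n == 0:
            raise ValueError("state has not been visited yet")
        return 1.0 / n


def intrinsic_reward(visits, state, next_state):
    # Assumption: "regulated" is read here as clipping the difference at zero,
    # so only moves toward less-visited states are rewarded.
    diff = visits.inverse_count(next_state) - visits.inverse_count(state)
    return max(diff, 0.0)


visits = VisitCounts()
for s in ["a", "a", "a", "a", "b"]:
    visits.visit(s)

r_forward = intrinsic_reward(visits, "a", "b")   # 1/1 - 1/4 = 0.75
r_backward = intrinsic_reward(visits, "b", "a")  # 1/4 - 1/1 < 0, clipped to 0.0
```

    Under this reading, a step from a well-visited state into a rarely visited one earns a positive bonus, while stepping back into familiar territory earns nothing.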

    • Go-Explore: a New Approach for Hard-Exploration Problems (147 citations)
    • RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments (14 citations, Highly Influential)
    • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models (283 citations)
    • VIME: Variational Information Maximizing Exploration (402 citations)
    • InfoBot: Transfer and Exploration via the Information Bottleneck (75 citations)
    • Self-Supervised Exploration via Disagreement (67 citations)
    • Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning (100 citations)
    • The NetHack Learning Environment (11 citations, Highly Influential)