Corpus ID: 186206882

Search on the Replay Buffer: Bridging Planning and Reinforcement Learning

@article{Eysenbach2019SearchOT,
  title={Search on the Replay Buffer: Bridging Planning and Reinforcement Learning},
  author={Benjamin Eysenbach and Ruslan Salakhutdinov and Sergey Levine},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.05253}
}
Abstract: The history of learning for control has been an exciting back and forth between two broad classes of algorithms: planning and reinforcement learning. [...] Our main insight is that this graph can be constructed via reinforcement learning, where a goal-conditioned value function provides edge weights, and nodes are taken to be previously seen observations in a replay buffer.
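The graph construction described in the abstract can be sketched minimally: states in the replay buffer become nodes, a distance estimate (standing in for the learned goal-conditioned value function, where distance would be derived from -V(s, g)) provides edge weights, and graph search returns a sequence of waypoints toward the goal. All names here (`build_graph`, `shortest_path`, `dist_fn`, `max_dist`) are illustrative assumptions, not the paper's actual code; a toy 1-D distance replaces the learned value function.

```python
import heapq

def build_graph(buffer, dist_fn, max_dist):
    """Connect buffer states whose estimated distance is below max_dist."""
    graph = {i: [] for i in range(len(buffer))}
    for i, s in enumerate(buffer):
        for j, g in enumerate(buffer):
            if i != j and dist_fn(s, g) <= max_dist:
                graph[i].append((j, dist_fn(s, g)))
    return graph

def shortest_path(graph, start, goal):
    """Dijkstra over the buffer graph; returns a list of waypoint indices."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    if goal not in dist:
        return None  # goal unreachable under the distance threshold
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy 1-D states: a learned -V(s, g) would estimate steps from s to g here.
buffer = [0.0, 1.0, 2.0, 3.0, 4.0]
dist_fn = lambda s, g: abs(s - g)
graph = build_graph(buffer, dist_fn, max_dist=1.5)
path = shortest_path(graph, 0, 4)  # waypoints [0, 1, 2, 3, 4]
```

Capping edges at `max_dist` matters because learned distance estimates are only locally reliable; chaining many short, trusted edges through the buffer is the point of combining search with the learned value function.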


    Citations

    Publications citing this paper (showing 1-10 of 26 citations):

    • Exploration via Hindsight Goal Generation (cites background)

    • Reinforcement Learning with Goal-Distance Gradient (cites background & methods; highly influenced)

    • Sparse Graphical Memory for Robust Planning (cites background & methods; highly influenced)

    • Adjust Planning Strategies to Accommodate Reinforcement Learning Agents (cites background; highly influenced)


    Citation statistics

    • 6 highly influenced citations

    • An average of 13 citations per year from 2019 through 2020
