Corpus ID: 220546016

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

@article{Xu2020MetaGradientRL,
  title={Meta-Gradient Reinforcement Learning with an Objective Discovered Online},
  author={Zhongwen Xu and Hado van Hasselt and Matteo Hessel and Junhyuk Oh and Satinder Singh and David Silver},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.08433}
}
  • Zhongwen Xu, H. V. Hasselt, +3 authors D. Silver
  • Published 2020
  • Computer Science, Mathematics
  • ArXiv
  • Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience…
