# Meta-Gradient Reinforcement Learning with an Objective Discovered Online

@article{Xu2020MetaGradientRL, title={Meta-Gradient Reinforcement Learning with an Objective Discovered Online}, author={Zhongwen Xu and H. V. Hasselt and Matteo Hessel and Junhyuk Oh and S. Singh and D. Silver}, journal={ArXiv}, year={2020}, volume={abs/2007.08433} }

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience…
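The general shape of a meta-gradient update can be illustrated on a toy scalar problem. The sketch below is not the paper's algorithm: it assumes a single parameter `theta`, a single meta-parameter `eta` standing in for a learned objective (here simply a learned regression target), and a fixed outer target `y`. All names and constants are hypothetical. The inner step updates `theta` towards `eta`; the outer step differentiates an outer loss, evaluated at the updated `theta`, with respect to `eta`.

```python
# Toy meta-gradient descent on scalars (illustrative assumption, not the
# method from the paper). Gradients are written out analytically so the
# example needs no autodiff library.

def inner_update(theta, eta, alpha):
    """One gradient step on the inner loss (theta - eta)^2.

    eta plays the role of the learned objective: it is the target
    that the inner update pulls theta towards.
    """
    grad_theta = 2.0 * (theta - eta)
    return theta - alpha * grad_theta


def meta_gradient(theta, eta, alpha, y):
    """Gradient of the outer loss (theta_new - y)^2 with respect to eta.

    theta_new = theta - 2 * alpha * (theta - eta), so
    d(theta_new)/d(eta) = 2 * alpha, and the chain rule gives the result.
    """
    theta_new = inner_update(theta, eta, alpha)
    return 2.0 * (theta_new - y) * (2.0 * alpha)


def run(steps=500, alpha=0.1, beta=0.5, y=3.0):
    """Alternate a meta-update of eta with an inner update of theta."""
    theta, eta = 0.0, 0.0
    for _ in range(steps):
        # Outer (meta) step: adjust the objective so that the next
        # inner update lands closer to the outer target y.
        eta -= beta * meta_gradient(theta, eta, alpha, y)
        # Inner step: update theta under the current learned objective.
        theta = inner_update(theta, eta, alpha)
    return theta, eta


if __name__ == "__main__":
    theta, eta = run()
    print(f"theta={theta:.4f}, eta={eta:.4f}")
```

With these assumed step sizes both `theta` and `eta` settle at the outer target `y`. In a deep RL setting the scalar `eta` would be replaced by the weights of a network that produces the update target, and the analytic derivative by automatic differentiation through the inner update.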
