Corpus ID: 225075792

Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning

  • Authors: Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, S. Levine
  • Published 2020
  • Computer Science, Mathematics
  • ArXiv
  • Abstract: We identify an implicit under-parameterization phenomenon in value-based deep RL methods that use bootstrapping: when value functions, approximated using deep neural networks, are trained with gradient descent using iterated regression onto target values generated by previous instances of the value network, more gradient updates decrease the expressivity of the current value network. We characterize this loss of expressivity in terms of a drop in the rank of the learned value network features…
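The rank drop described in the abstract can be measured directly on a feature matrix Φ (one row per state, one column per learned feature). A minimal sketch, assuming an effective-rank measure of the form used for this purpose — the smallest number of singular values whose sum accounts for a (1 − δ) fraction of the total; the function name and the δ = 0.01 default are illustrative choices, not part of the original text:

```python
import numpy as np

def effective_rank(features: np.ndarray, delta: float = 0.01) -> int:
    """Smallest k such that the top-k singular values of the feature
    matrix sum to at least a (1 - delta) fraction of all singular values."""
    singular_values = np.linalg.svd(features, compute_uv=False)
    cumulative = np.cumsum(singular_values) / np.sum(singular_values)
    # searchsorted gives the first index where the cumulative mass
    # reaches the threshold; +1 converts the 0-based index to a count.
    return int(np.searchsorted(cumulative, 1.0 - delta) + 1)

rng = np.random.default_rng(0)
# A generic random feature matrix spreads mass across many directions...
full_rank = rng.standard_normal((256, 64))
# ...while features collapsed onto a 2-dimensional subspace do not.
collapsed = rng.standard_normal((256, 2)) @ rng.standard_normal((2, 64))
print(effective_rank(full_rank), effective_rank(collapsed))
```

On matrices like these, the collapsed features report an effective rank near 2 while the generic ones report a much larger value, mirroring the expressivity loss the paper attributes to repeated bootstrapped regression.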
    Citations: 1

