Corpus ID: 52904113

Generalization and Regularization in DQN

@article{Farebrother2018GeneralizationAR,
  title={Generalization and Regularization in DQN},
  author={Jesse Farebrother and Marlos C. Machado and Michael H. Bowling},
  journal={ArXiv},
  year={2018},
  volume={abs/1810.00123}
}
Deep reinforcement learning (RL) algorithms have shown an impressive ability to learn complex control policies in high-dimensional environments. However, despite the ever-increasing performance on popular benchmarks such as the Arcade Learning Environment (ALE), policies learned by deep RL algorithms often struggle to generalize when evaluated in remarkably similar environments. In this paper, we assess the generalization capabilities of DQN, one of the most traditional deep RL algorithms in…
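The regularizers examined in this line of work are standard supervised-learning tools applied to the DQN objective. As a minimal sketch of the idea (the function name and coefficient are illustrative, not taken from the paper), an L2 weight penalty added to the mean squared TD error looks like:

```python
import numpy as np

def regularized_td_loss(q_pred, q_target, weights, l2_coeff=1e-4):
    """Mean squared TD error plus an L2 weight penalty, the kind of
    regularizer studied for improving DQN generalization."""
    td = np.mean((q_pred - q_target) ** 2)
    penalty = l2_coeff * sum(np.sum(w ** 2) for w in weights)
    return td + penalty
```

With zero TD error, the loss reduces to the penalty alone, so the coefficient directly trades off fit against weight magnitude.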
Regularization Matters in Policy Optimization
Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques…
It is found that conventional regularization techniques on the policy networks can often bring large improvements in task performance, and the improvement is typically more significant when the task is more difficult.
Improving Generalization in Reinforcement Learning with Mixture Regularization
This work introduces a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments and imposes linearity constraints on the observation interpolations and the supervision, increasing data diversity more effectively and helping learn smoother policies.
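The mixture described above is mixup-style interpolation applied to RL observations and their supervision signals (e.g. value targets). A minimal numpy sketch, assuming two batches of equal shape (function and parameter names are illustrative):

```python
import numpy as np

def mixreg_batch(obs_a, obs_b, sup_a, sup_b, alpha=0.2, rng=None):
    """Mix two training batches: interpolate observations and their
    supervision signals with the same lambda drawn from Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    mixed_obs = lam * obs_a + (1 - lam) * obs_b
    mixed_sup = lam * sup_a + (1 - lam) * sup_b
    return mixed_obs, mixed_sup, lam
```

Using the same lambda for both observation and supervision is what enforces the linearity constraint between inputs and targets.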
Deep Reinforcement Learning with Weighted Q-Learning
This work provides the methodological advances needed to benefit from the WQL properties in Deep Reinforcement Learning (DRL), using neural networks with Dropout Variational Inference as an effective approximation of deep Gaussian processes.
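Weighted Q-Learning replaces the hard max over actions with an expectation that weights each action's value by the probability it is the true maximizer; with dropout, that probability can be estimated empirically from sampled forward passes. A rough numpy sketch under that assumption (names are illustrative, not the paper's API):

```python
import numpy as np

def weighted_q_estimate(q_samples):
    """Given K sampled Q-value vectors (e.g. from K dropout forward passes),
    estimate max_a Q(s,a) as a weighted sum of mean action values, where each
    action is weighted by the empirical probability that it is the maximizer."""
    q_samples = np.asarray(q_samples)          # shape (K, num_actions)
    argmaxes = np.argmax(q_samples, axis=1)    # maximizing action per sample
    k, n_actions = q_samples.shape
    w = np.bincount(argmaxes, minlength=n_actions) / k  # P(a is the max)
    return float(np.dot(w, q_samples.mean(axis=0)))
```

When the samples disagree about the best action, the estimate averages over the candidates instead of committing to a single (possibly overestimated) max.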
Quantifying Generalization in Reinforcement Learning
It is shown that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation, and batch normalization.
Efficient decorrelation of features using Gramian in Reinforcement Learning
An online regularization framework for decorrelating features in RL is developed, and the proposed algorithm is proved to converge in the linear function approximation setting without changing the main objective of maximizing cumulative reward.
When Is Generalizable Reinforcement Learning Tractable?
This work introduces Strong Proximity, a structural condition that precisely characterizes the relative closeness of different environments, and shows that RL can require query complexity exponential in the horizon to generalize under a natural weakening of this condition.
Monotonic Robust Policy Optimization with Model Discrepancy
This paper theoretically derives a lower bound on the worst-case performance of a given policy by relating it to the expected performance, and develops a practical algorithm, monotonic robust policy optimization (MRPO), which can generally improve both the average and worst-case performance in the source training environments.
Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
This paper investigates causes of instability when using data augmentation in common off-policy RL algorithms, proposes a simple yet effective technique for stabilizing this class of algorithms under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
Automatic Data Augmentation for Generalization in Reinforcement Learning
Deep reinforcement learning (RL) agents often fail to generalize to unseen scenarios, even when they are trained on many instances of semantically similar environments. Data augmentation has recently…

References

SHOWING 1-10 OF 27 REFERENCES
A Study on Overfitting in Deep Reinforcement Learning
This paper conducts a systematic study of standard RL agents, finds that they can overfit in various ways, and calls for more principled and careful evaluation protocols in RL.
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
It is concluded that reinforcement learning can work robustly in conjunction with function approximators, and that there is little justification at present for avoiding the case of general λ.
Deep Reinforcement Learning that Matters
Challenges posed by reproducibility, proper experimental techniques, and reporting procedures are investigated, and guidelines to make future results in deep RL more reproducible are suggested.
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
A new distributed agent, IMPALA (Importance Weighted Actor-Learner Architecture), is developed that not only uses resources more efficiently in single-machine training but also scales to thousands of machines without sacrificing data efficiency or resource utilisation.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously and then generalize its knowledge to new domains, using Atari games as a testing environment to demonstrate these methods.
Procedural Level Generation Improves Generality of Deep Reinforcement Learning
This paper presents an approach to preventing overfitting by generating more general agent controllers: the agent is trained on a completely new, procedurally generated level each episode.
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning…
Distral: Robust multitask reinforcement learning
This work proposes a new approach for joint training of multiple tasks, referred to as Distral (Distill & transfer learning), and shows that the proposed learning process is more robust and more stable, attributes that are critical in deep reinforcement learning.
Human-level control through deep reinforcement learning
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Sharp Minima Can Generalize For Deep Nets
It is argued that most notions of flatness are problematic for deep models and cannot be directly applied to explain generalization; focusing on deep networks with rectifier units, the particular geometry of parameter space induced by the inherent symmetries these architectures exhibit is exploited.