An Introduction to Deep Reinforcement Learning

  • Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, Joelle Pineau
  • Found. Trends Mach. Learn.
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the… 

A Comprehensive Discussion on Deep Reinforcement Learning

  • Weikang Xu, Linbo Chen, Hongyu Yang
  • Computer Science
    2021 International Conference on Communications, Information System and Computer Engineering (CISCE)
  • 2021
An overview of recent achievements in deep reinforcement learning, an overview of deep RL applications, and an outlook on the future of deep reinforcement learning are presented.

Deep Reinforcement Learning Techniques in Diversified Domains: A Survey

It is found that even after good results have been obtained in Atari, Go, robotics, and multi-agent scenarios, challenges remain, such as generalization, satisfying multiple objectives, divergence, and learning robust policies.

A survey and critique of multiagent deep reinforcement learning

A clear overview of current multiagent deep reinforcement learning (MDRL) literature is provided to help unify and motivate future research to take advantage of the abundant literature that exists in a joint effort to promote fruitful research in the multiagent community.

Deep Reinforcement Learning for the Control of Robotic Manipulation: A Focussed Mini-Review

This paper presents recent significant progress in deep reinforcement learning algorithms that tackle problems arising in robotic manipulation control, such as sample efficiency and generalization.

Deep Reinforcement Learning: A State-of-the-Art Walkthrough

The key differences between the various kinds of algorithms are discussed, their potential and limitations are indicated, and insights regarding future directions of the field are provided to researchers.

QHD: A brain-inspired hyperdimensional reinforcement learning algorithm

QHD, a hyperdimensional reinforcement learning algorithm that mimics brain properties to achieve robust, real-time learning, is proposed; it relies on a lightweight brain-inspired model to learn an optimal policy in an unknown environment and is suited to highly efficient reinforcement learning in edge environments.

Comparison of Multiple Reinforcement Learning and Deep Reinforcement Learning Methods for the Task Aimed at Achieving the Goal

Several reinforcement learning methods are compared on a goal-reaching task using the UR3 robotic arm, with the aim of minimizing the Euclidean distance error and smoothing the resulting path with the Bézier spline method.

A Very Condensed Survey and Critique of Multiagent Deep Reinforcement Learning

The primary goal of this extended abstract is to provide a broad overview of current multiagent deep reinforcement learning (MDRL) literature, hopefully motivating the reader to review the authors' 47-page JAAMAS survey article [28].

A Survey on Deep Reinforcement Learning for Audio-Based Applications

A comprehensive survey of the progress of DRL in the audio domain is presented, bringing together research studies across different but related areas in speech and music and outlining important challenges faced by audio-based DRL agents.

State-of-the-Art Reinforcement Learning Algorithms

This research paper brings together many different aspects of current research in several fields associated with reinforcement learning, covering a wide variety of topics and learning algorithms such as Markov Decision Processes, Q-Learning, Temporal Difference Learning, Actor-Critic Algorithms, Deep Deterministic Policy Gradients, and Evolution Strategies.



Learning to reinforcement learn

This work introduces a novel approach to deep meta-reinforcement learning, which is a system that is trained using one RL algorithm, but whose recurrent dynamics implement a second, quite separate RL procedure.

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning

This paper proposes to represent a "fast" reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data, encoded in the weights of the RNN, which are learned slowly through a general-purpose ("slow") RL algorithm.

Stochastic Neural Networks for Hierarchical Reinforcement Learning

This work proposes a general framework that first learns useful skills in a pre-training environment, and then leverages the acquired skills for learning faster in downstream tasks, and uses Stochastic Neural Networks combined with an information-theoretic regularizer to efficiently pre-train a large span of skills.

Continuous Deep Q-Learning with Model-based Acceleration

This paper derives a continuous variant of the Q-learning algorithm, which it calls normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods, and substantially improves performance on a set of simulated robotic control tasks.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies

When the discount factor progressively increases up to its final value, it is empirically shown that both the number of learning steps and the risk of falling into a local optimum during the learning process can be significantly reduced, thus connecting the discussion with the exploration/exploitation dilemma.

A Study on Overfitting in Deep Reinforcement Learning

This paper conducts a systematic study of standard RL agents and finds that they could overfit in various ways and calls for more principled and careful evaluation protocols in RL.

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

Asynchronous Methods for Deep Reinforcement Learning

A conceptually simple and lightweight framework for deep reinforcement learning is proposed that uses asynchronous gradient descent for optimization of deep neural network controllers, and it is shown that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

Recurrent Reinforcement Learning: A Hybrid Approach

This work investigates a deep-learning approach to learning the representation of states in partially observable tasks, with minimal prior knowledge of the domain, and proposes a new family of hybrid models that combine the strengths of both supervised learning and reinforcement learning, trained in a joint fashion.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.