Deep Reinforcement Learning With Quantum-Inspired Experience Replay

Authors: Qing Wei, Hailan Ma, Chunlin Chen, D. Dong
Journal: IEEE Transactions on Cybernetics
In this article, a novel training paradigm inspired by quantum computation is proposed for deep reinforcement learning (DRL) with experience replay. In contrast to the traditional experience replay mechanism in DRL, the proposed DRL with quantum-inspired experience replay (DRL-QER) adaptively chooses experiences (also called transitions) from the replay buffer according to each experience's complexity and the number of times it has been replayed, to achieve a balance between exploration and exploitation. In DRL…
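
The selection idea in the abstract, favoring complex experiences while accounting for how often each has already been replayed, can be illustrated with a purely classical sketch. This is not the DRL-QER procedure itself; the truncated abstract does not spell out its quantum-inspired operations. The sketch assumes that the absolute TD error is a reasonable proxy for "complexity" and that each replay multiplies an experience's score by a hypothetical `decay` factor. The names `sampling_probabilities`, `sample_batch`, `decay`, and `eps` are illustrative only.

```python
import numpy as np


def sampling_probabilities(td_errors, replay_counts, decay=0.9, eps=1e-6):
    """Turn per-experience statistics into sampling probabilities.

    Assumed scoring rule (hypothetical, not taken from the paper):
    score = (|TD error| + eps) * decay ** replay_count
    so a larger TD error raises priority and each replay lowers it.
    """
    td_errors = np.asarray(td_errors, dtype=float)
    replay_counts = np.asarray(replay_counts, dtype=float)
    scores = (np.abs(td_errors) + eps) * decay ** replay_counts
    return scores / scores.sum()


def sample_batch(td_errors, replay_counts, batch_size, rng):
    """Draw distinct buffer indices according to the probabilities above."""
    probs = sampling_probabilities(td_errors, replay_counts)
    idx = rng.choice(len(probs), size=batch_size, replace=False, p=probs)
    return idx, probs


# Toy buffer: experiences 1 and 3 have the same TD error,
# but experience 3 has already been replayed five times.
rng = np.random.default_rng(0)
td_errors = [0.1, 2.0, 0.5, 2.0]
replay_counts = [0, 0, 0, 5]
idx, probs = sample_batch(td_errors, replay_counts, batch_size=2, rng=rng)
```

In this toy buffer, the frequently replayed experience 3 gets a lower probability than the otherwise identical experience 1. That is the kind of balance between exploiting informative transitions and exploring rarely used ones that the abstract describes.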


Asynchronous Curriculum Experience Replay: A Deep Reinforcement Learning Approach for UAV Autonomous Motion Control in Unknown Dynamic Environments

The autonomous motion control (AMC) problem is formulated as a Markov decision process (MDP) and an advanced deep reinforcement learning (DRL) method that allows UAVs to execute complex tasks in large-scale dynamic three-dimensional (3D) environments is proposed.

Variational Quantum Soft Actor-Critic

This work develops a quantum reinforcement learning algorithm based on soft actor-critic, using a hybrid quantum-classical policy network that consists of a variational quantum circuit and a classical artificial neural network, and analyzes the effect of different hyper-parameters and policy network architectures.

Path Planning for Cellular-Connected UAV: A DRL Solution with Quantum-Inspired Experience Replay

To help the DRL agent strike a better trade-off between sampling priority and diversity, a novel quantum-inspired experience replay (QiER) framework is proposed, which relates each experienced transition's importance to an associated quantum bit (qubit) and applies a Grover-iteration-based amplitude amplification technique.

Quantum Language Model with Entanglement Embedding for Question Answering

This work proposes a NN model with a novel entanglement embedding (EE) module, whose function is to transform the word sequence into an entangled pure state representation, and shows that QLM-EE achieves superior performance compared with the classical deep NN models and other QLMs on question answering (QA) datasets.

RIS-Assisted Multi-Antenna AmBC Signal Detection Using Deep Reinforcement Learning

An efficient RIS-based multi-antenna AmBC system is developed that can achieve information transmission and energy collection simultaneously, and a smart AmBC signal detection method based on twin delayed deep deterministic policy gradient (TD3), a deep reinforcement learning algorithm, is presented.

Crypto Makes AI Evolve

The path and stages of this evolution of AI, focusing on the role of quantum-inspired and bio-inspired AI, are studied, including Crypto-Sensitive AI, Crypto-Adapted AI, Crypto-Friendly AI, Crypto-Enabled AI, and Crypto-Protected AI.

AI-Assisted Authentication: State of the Art, Taxonomy and Future Roadmap

This research is the first of its kind to focus on the roles of AI in authentication, which is used in a wide range of scenarios, including facial recognition to access buildings and keystroke dynamics to unlock smartphones.

The Dichotomy of Cloud and IoT: Cloud-Assisted IoT From a Security Perspective

This study starts by reviewing existing relevant surveys and noting their shortcomings, which motivate a comprehensive survey in this area. It then highlights existing approaches towards the design of Secure CAIoT (SCAIoT), along with related security challenges and controls, and develops a layered architecture for SCAIoT.

Turning the Hunted into the Hunter via Threat Hunting: Life Cycle, Ecosystem, Challenges and the Great Promise of AI

A life cycle and ecosystem for privacy-threat hunting are established, the related challenges are identified, and the critical role of AI in threat hunting is examined.

Privacy-Preserving Cloud Computing: Ecosystem, Life Cycle, Layered Architecture and Future Roadmap

This paper helps to identify existing trends in research on privacy-preserving cloud systems by establishing a layered architecture, a life cycle, and an ecosystem for such systems.



Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning

The experimental results show that the proposed curriculum training paradigm of DCRL is also applicable and effective for other memory-based deep reinforcement learning approaches, such as double DQN and dueling network.

Competitive Experience Replay

This work proposes a novel method called competitive experience replay, which efficiently supplements a sparse reward by placing learning in the context of an exploration competition between a pair of agents, creating a competitive game designed to drive exploration.

Remember and Forget for Experience Replay

Remember and Forget Experience Replay (ReF-ER) is introduced, a novel method that can enhance RL algorithms with parameterized policies and consistently improves the performance of continuous-action, off-policy RL on fully observable benchmarks and partially observable flow control problems.

Attentive Experience Replay

Attentive Experience Replay (AER) is introduced, a novel experience replay algorithm that samples transitions according to the similarities between their states and the agent's state, and it is demonstrated that AER makes consistent improvements on the suite of OpenAI gym tasks.

Deep Reinforcement Learning with Double Q-Learning

This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.

Fidelity-Based Probabilistic Q-Learning for Control of Quantum Systems

A fidelity-based probabilistic Q-learning (FPQL) approach is presented and applied to learning control of quantum systems; the results show that FPQL algorithms attain a better balance between exploration and exploitation, and can also avoid locally optimal policies and accelerate the learning process.

The importance of experience replay database composition in deep reinforcement learning

The potential of the Deep Deterministic Policy Gradient method for a robot control problem both in simulation and in a real setup is investigated and some requirements on the distribution over the state-action space of the experiences in the database are identified.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Dueling Network Architectures for Deep Reinforcement Learning

This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.

Control of exploitation-exploration meta-parameter in reinforcement learning