Corpus ID: 247763011

Unentangled quantum reinforcement learning agents in the OpenAI Gym

Jen-Yueh Hsiao, Yuxuan Du, Wei-Yin Chiang, Min-Hsiu Hsieh, Hsi-Sheng Goan
Classical reinforcement learning (RL) has produced excellent results across many domains; however, its sample inefficiency remains a critical issue. In this paper, we provide concrete numerical evidence that the sample efficiency (speed of convergence) of quantum RL can exceed that of classical RL, and that, to achieve comparable learning performance, quantum RL can use far fewer (at least one order of magnitude fewer) trainable parameters than classical RL. Specifically, we employ the…
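The abstract's key ingredient, a parameterized quantum circuit with no entangling gates, can be illustrated with a minimal NumPy sketch. Everything below (the function names, the two-action softmax readout, the toy observation) is an assumption for illustration, not the authors' implementation: each qubit is independently rotated by RY(angle-encoded feature + trainable parameter) from |0⟩, and since ⟨Z⟩ of RY(a)|0⟩ is cos(a), the whole product-state forward pass reduces to a cosine layer.

```python
import numpy as np

def unentangled_policy(x, thetas):
    """Product-state 'circuit': qubit i is prepared as RY(x[i] + thetas[i])|0>,
    with no entangling gates, so each qubit evolves independently.
    The Pauli-Z expectation of RY(a)|0> is cos(a)."""
    z_expvals = np.cos(x + thetas)                           # one <Z> per qubit
    logits = np.array([z_expvals.sum(), -z_expvals.sum()])   # hypothetical 2-action readout
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                                   # softmax action probabilities

x = np.array([0.1, -0.4])    # angle-encoded observation (toy values)
thetas = np.zeros(2)         # trainable parameters, one per qubit
probs = unentangled_policy(x, thetas)
```

Note the parameter count: one angle per qubit per layer, which is the sense in which such circuits can use far fewer trainable parameters than a comparable classical network.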


Quantum Architecture Search via Continual Reinforcement Learning
The Probabilistic Policy Reuse with deep Q-learning (PPR-DQL) framework is presented, and it is demonstrated that an RL agent with PPR can find a quantum gate sequence that generates the two-qubit Bell state faster than an agent trained from scratch.
Quantum Architecture Search via Deep Reinforcement Learning
A quantum architecture search framework that uses deep reinforcement learning (DRL) to address the challenge of generating quantum gate sequences for multi-qubit GHZ states, without encoding any knowledge of quantum physics into the agent.
Introduction to Quantum Reinforcement Learning: Theory and PennyLane-based Implementation
This work introduces the concept of quantum reinforcement learning using a variational quantum circuit, confirms its feasibility through implementation and experiments, and walks through an implementation using the PennyLane library.
On exploring practical potentials of quantum auto-encoder with advantages
This work proves that a quantum auto-encoder (QAE) can efficiently compute the eigenvalues, and prepare the corresponding eigenvectors, of a high-dimensional quantum state with the low-rank property, and proves that the error bounds of the proposed QAE-based methods improve on those in previous literature.
Hybrid quantum-classical classifier based on tensor network and variational quantum circuit
A hybrid model combining quantum-inspired tensor networks (TN) and variational quantum circuits (VQC) to perform supervised learning with end-to-end training. On binary classification of the MNIST dataset, a matrix-product-state-based TN with low bond dimension outperforms PCA as a feature extractor for compressing data into VQC inputs.
Quantum Optimization for Training Quantum Neural Networks
This paper coherently encodes the cost function of a QNN onto the relative phases of a superposition state in the Hilbert space of the network parameters, which is then tuned with an iterative quantum optimisation structure using adaptively selected Hamiltonians.
Quantum circuit architecture search: error mitigation and trainability enhancement for variational quantum solvers
QAS implicitly learns a rule that effectively suppresses the influence of quantum noise and the barren plateau, and is implemented on both a numerical simulator and real quantum hardware (via the IBM cloud) to accomplish data classification and quantum ground-state approximation tasks.
Efficient Measure for the Expressivity of Variational Quantum Algorithms.
The superiority of variational quantum algorithms (VQAs) such as quantum neural networks (QNNs) and variational quantum eigensolvers (VQEs) depends heavily on the expressivity of the employed ansätze.
Generation of High Resolution Handwritten Digits with an Ion-Trap Quantum Computer.
This work implements a quantum-circuit-based generative model to sample the prior distribution of a Generative Adversarial Network (GAN), and introduces a multi-basis technique that leverages the unique possibility of measuring quantum states in different bases, thereby enhancing the expressibility of the prior distributions to be learned.
Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning
It is shown that decoupling the task policy from the exploration policy makes directed exploration highly effective for sample-efficient continuous control, and that, when used in conjunction with soft actor-critic (SAC), DEEP incurs no performance penalty in dense-reward environments.
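The decoupling idea can be sketched on a toy multi-armed bandit; every detail below (the bandit setup, the least-visited-arm exploration rule) is an assumption for illustration, not the paper's algorithm. One policy is responsible only for gathering data (directed exploration toward under-visited arms), while a separate task policy purely exploits the value estimates learned from that shared data.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])   # hypothetical 3-armed bandit
counts = np.zeros(3)
values = np.zeros(3)

def exploration_policy():
    # Directed exploration: always try the least-visited arm.
    return int(np.argmin(counts))

def task_policy():
    # Pure exploitation of the learned value estimates.
    return int(np.argmax(values))

# The exploration policy alone collects the experience...
for _ in range(300):
    a = exploration_policy()
    r = rng.normal(true_means[a], 0.1)
    counts[a] += 1
    values[a] += (r - values[a]) / counts[a]   # incremental mean update

# ...and the decoupled task policy exploits it.
best = task_policy()
```

Because exploration is not tied to the task policy's (initially uninformative) value estimates, every arm is sampled evenly and the task policy ends up with accurate estimates to exploit.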