# Multi-agent Reinforcement Learning Accelerated MCMC on Multiscale Inversion Problem

@article{Chung2020MultiagentRL, title={Multi-agent Reinforcement Learning Accelerated MCMC on Multiscale Inversion Problem}, author={Eric T. Chung and Yalchin R. Efendiev and Wing Tat Leung and Sai-Mang Pun and Zecheng Zhang}, journal={ArXiv}, year={2020}, volume={abs/2011.08954} }

In this work, we propose a multi-agent actor-critic reinforcement learning (RL) algorithm to accelerate multi-level Markov chain Monte Carlo (MCMC) sampling algorithms. The policies (actors) of the agents generate the proposals in the MCMC steps, and the centralized critic estimates the long-term reward. We verify the proposed algorithm by solving an inverse problem with multiple scales. There are several difficulties in the implementation of this…
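To make the role of a policy-generated proposal concrete, the following is a minimal single-agent, single-level sketch of a Metropolis-Hastings step, not the algorithm of the paper. Everything in it is an assumption for illustration: the target `log_target` is a toy standard normal, the "actor" `policy_mean` is a fixed linear map with a hypothetical parameter `theta`, and there is no critic or training loop. Because the proposal depends on the current state, the acceptance ratio includes the Hastings correction.

```python
import math
import random


def log_target(x):
    """Hypothetical 1-D log-posterior (standard normal, up to a constant)."""
    return -0.5 * x * x


def policy_mean(x, theta):
    """Stand-in for an actor: maps the current state to a proposal mean."""
    return theta * x


def log_proposal(y, x, theta, sigma):
    """Log-density of proposing y from state x under the Gaussian policy."""
    z = (y - policy_mean(x, theta)) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def mh_step(x, theta, sigma, rng):
    """One Metropolis-Hastings step; returns (next_state, accepted)."""
    y = rng.gauss(policy_mean(x, theta), sigma)
    # The proposal is asymmetric, so include the reverse/forward densities.
    log_alpha = (log_target(y) - log_target(x)
                 + log_proposal(x, y, theta, sigma)
                 - log_proposal(y, x, theta, sigma))
    accept_prob = math.exp(min(0.0, log_alpha))
    if rng.random() < accept_prob:
        return y, True
    return x, False


def run_chain(n_steps, theta=0.9, sigma=1.0, x0=0.0, seed=0):
    """Run the chain and return (samples, acceptance_rate)."""
    rng = random.Random(seed)
    x, samples, n_accept = x0, [], 0
    for _ in range(n_steps):
        x, accepted = mh_step(x, theta, sigma, rng)
        samples.append(x)
        n_accept += accepted
    return samples, n_accept / n_steps


if __name__ == "__main__":
    samples, rate = run_chain(5000)
    print("acceptance rate:", rate)
    print("sample mean:", sum(samples) / len(samples))
```

In an RL-accelerated variant, one would expect `theta` (or a neural policy in its place) to be updated from a reward signal such as acceptance or distance moved; how the paper defines its reward and critic is not stated in the truncated abstract above.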

## 8 Citations

### Multi-variance replica exchange stochastic gradient MCMC for inverse and forward Bayesian physics-informed neural network

- Computer Science
- ArXiv
- 2021

The proposed multi-variance replica exchange stochastic gradient Langevin diffusion method is employed to train the Bayesian PINN to solve forward and inverse problems; it significantly lowers the computational cost in the high-temperature chain while preserving accuracy and converging quickly.

### Accelerated replica exchange stochastic gradient Langevin diffusion enhanced Bayesian DeepONet for solving noisy parametric PDEs

- Computer Science
- ArXiv
- 2021

This work proposes an accelerated training framework for replica-exchange Langevin diffusion that exploits the neural network architecture of DeepONets to reduce its computational cost by up to 25% without compromising the proposed framework's performance.

### A deep neural network approach on solving the linear transport model under diffusive scaling

- Computer Science
- ArXiv
- 2021

It is proved theoretically that the total loss vanishes as the neural network converges, upon which the neural-network-approximated solution converges pointwise to the analytic solution of the linear transport model.

### Efficient hybrid explicit-implicit learning for multiscale problems

- Computer Science, Mathematics
- J. Comput. Phys.
- 2022

### HEI: hybrid explicit-implicit learning for multiscale problems

- Computer Science
- ArXiv
- 2021

The goal is to use temporal splitting concepts in designing machine learning algorithms and, at the same time, help splitting algorithms by incorporating data and speeding them up.

### Hybrid explicit-implicit learning for multiscale problems with time dependent source

- Computer Science
- ArXiv
- 2022

The splitting method is a powerful method for solving partial differential equations. Various splitting methods have been designed to separate different physics, nonlinearities, and so on. Recently, a…

### NH-PINN: Neural homogenization based physics-informed neural network for multiscale problems

- Computer Science
- Journal of Computational Physics
- 2022

### Theoretical and numerical studies of inverse source problem for the linear parabolic equation with sparse boundary measurements

- Mathematics
- ArXiv
- 2021

Theoretically, it is proved that the flux data from any nonempty open subset of the boundary can uniquely determine the semi-discrete source, and that is why the data is called sparse boundary data.

## References

SHOWING 1-10 OF 45 REFERENCES

### Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

- Computer Science
- ICML
- 2017

Two methods enable the successful combination of experience replay with multi-agent RL: using a multi-agent variant of importance sampling to naturally decay obsolete data, and conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory.

### Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

- Computer Science
- NIPS
- 2017

An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

### Soft Actor-Critic Algorithms and Applications

- Computer Science
- ArXiv
- 2018

Soft Actor-Critic (SAC), the recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework, achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance.

### Actor-Attention-Critic for Multi-Agent Reinforcement Learning

- Computer Science
- ICML
- 2019

This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings using centrally computed critics that share an attention mechanism, which selects relevant information for each agent at every timestep. This enables more effective and scalable learning in complex multi-agent environments than recent approaches.

### Counterfactual Multi-Agent Policy Gradients

- Computer Science
- AAAI
- 2018

A new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients is presented; it uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.

### Continuous control with deep reinforcement learning

- Computer Science
- ICLR
- 2016

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

### Reinforcement Learning with Deep Energy-Based Policies

- Computer Science
- ICML
- 2017

A method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before, is proposed and a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution is applied.

### Learn From Thy Neighbor: Parallel-Chain and Regional Adaptive MCMC

- Computer Science
- 2009

This paper draws attention to the deficient performance of standard adaptation when the target distribution is multimodal and proposes a parallel chain adaptation strategy that incorporates multiple Markov chains which are run in parallel.

### High-Dimensional Continuous Control Using Generalized Advantage Estimation

- Computer Science
- ICLR
- 2016

This work addresses the large number of samples typically required and the difficulty of obtaining stable and steady improvement despite the nonstationarity of the incoming data by using value functions to substantially reduce the variance of policy gradient estimates at the cost of some bias.

### Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

- Computer Science
- NIPS
- 2016

h-DQN, a framework that integrates hierarchical value functions operating at different temporal scales with intrinsically motivated deep reinforcement learning, is presented; it allows for flexible goal specifications, such as functions over entities and relations.