Corpus ID: 173990609

Neural Replicator Dynamics

@article{Omidshafiei2019NeuralRD,
  title={Neural Replicator Dynamics},
  author={Shayegan Omidshafiei and D. Hennes and Dustin Morrill and R. Munos and Julien P{\'e}rolat and Marc Lanctot and A. Gruslys and Jean-Baptiste Lespiau and K. Tuyls},
  journal={ArXiv},
  year={2019},
  volume={abs/1906.00190}
}
  • Shayegan Omidshafiei, D. Hennes, Dustin Morrill, R. Munos, Julien Pérolat, Marc Lanctot, A. Gruslys, Jean-Baptiste Lespiau, K. Tuyls
  • Published 2019
  • Computer Science, Mathematics
  • ArXiv
  • Policy gradient and actor-critic algorithms form the basis of many commonly used training techniques in deep reinforcement learning. Using these algorithms in multiagent environments poses problems such as nonstationarity and instability. In this paper, we first demonstrate that standard softmax-based policy gradient can be prone to poor performance in the presence of even the most benign nonstationarity. By contrast, it is known that the replicator dynamics, a well-studied model from…
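
The replicator dynamics named in the abstract can be illustrated with a minimal sketch. It uses the textbook single-population form, in which each action's probability grows in proportion to how far its payoff exceeds the population average, integrated with a plain Euler step. It is not the paper's algorithm: the function name `replicator_step`, the step size `dt`, and the two-action payoff matrix `PAYOFFS` are all assumptions chosen for illustration.

```python
def replicator_step(x, payoffs, dt=0.01):
    """One Euler step of single-population replicator dynamics.

    x:       list of action probabilities (assumed to sum to 1).
    payoffs: square matrix; payoffs[i][j] is the payoff of action i
             when matched against action j.
    """
    n = len(x)
    # Expected payoff ("fitness") of each action against the current mix.
    fitness = [sum(payoffs[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Population-average payoff.
    average = sum(x[i] * fitness[i] for i in range(n))
    # dx_i/dt = x_i * (fitness_i - average)
    return [x[i] + dt * x[i] * (fitness[i] - average) for i in range(n)]


# Hypothetical game (not from the paper): action 0 strictly dominates action 1.
PAYOFFS = [[2.0, 2.0],
           [1.0, 1.0]]

x = [0.5, 0.5]
for _ in range(2000):
    x = replicator_step(x, PAYOFFS)
```

In this toy game the probability mass should drift toward the dominant action while the probabilities continue to sum to one, because the update terms cancel across actions.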
    Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
    • 60
    SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference
    • 11
    Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications
    • 2
    DREAM: Deep Regret minimization with Advantage baselines and Model-free learning
    • 1
    Bounds for Approximate Regret-Matching Algorithms
    • 1

    References

    Publications referenced by this paper.
    SHOWING 1-10 OF 98 REFERENCES
    Reinforcement Learning: An Introduction
    • 25,507
    • Highly Influential
    A decision-theoretic generalization of on-line learning and an application to boosting
    • 11,015
    • Highly Influential
    Continuous control with deep reinforcement learning
    • 3,529
    • Highly Influential
    Proximal Policy Optimization Algorithms
    • 2,716
    Asynchronous Methods for Deep Reinforcement Learning
    • 3,302
    • Highly Influential
    Evolutionary Games and Population Dynamics
    • 4,370
    • Highly Influential
    Trust Region Policy Optimization
    • 2,401
    Multiagent learning using a variable learning rate
    • 711
    • Highly Influential
    The Nonstochastic Multiarmed Bandit Problem
    • 1,561
    Evolutionarily Stable Strategies and Game Dynamics
    • 1,931