Corpus ID: 244729393

MAMRL: Exploiting Multi-agent Meta Reinforcement Learning in WAN Traffic Engineering

@inproceedings{Sun2021MAMRLEM,
  title={MAMRL: Exploiting Multi-agent Meta Reinforcement Learning in WAN Traffic Engineering},
  author={Shan Sun and M. Kiran and Wei Ren},
  year={2021}
}
Traffic optimization challenges, such as load balancing, flow scheduling, and improving packet delivery time, are difficult online decision-making problems in wide area networks (WAN). Complex heuristics are needed, for instance, to find optimal paths that improve packet delivery time and minimize interruptions that may be caused by link failures or congestion. The recent success of reinforcement learning (RL) algorithms can provide useful solutions to build more robust systems that learn from…

References

Multi-agent deep learning for simultaneous optimization for time and energy in distributed routing system

Toward Packet Routing With Fully Distributed Multiagent Deep Reinforcement Learning

A novel packet routing framework based on multiagent deep RL (DRL) is proposed, in which each router possesses an independent long short-term memory (LSTM) recurrent neural network (RNN) for training and decision making in a fully distributed environment.

Application of reinforcement learning to routing in distributed wireless networks: a review

This article focuses on the application of traditional and enhanced RL models to routing in wireless networks, and provides an extensive review of new features in RL-based routing and of how various routing challenges and problems have been approached using RL.

Dual Reinforcement Q-Routing: An On-Line Adaptive Routing Algorithm

Experiments over several network topologies have shown that at low loads, DRQ-Routing learns the optimal policy more than twice as fast as Q-Routing, and at high loads, it learns routing policies that are more than twice as good as Q-Routing in terms of average packet delivery time.

Application of Deep Reinforcement Learning in Traffic Signal Control: An Overview and Impact of Open Traffic Data

This paper provides a comprehensive analysis of the most recent DRL approaches used for adaptive traffic signal control (ATSC) algorithm design, with special emphasis on traffic state representation and multi-agent DRL frameworks applied to large traffic networks.

Ants and Reinforcement Learning: A Case Study in Routing in Dynamic Networks

Two new distributed routing algorithms for data networks are investigated. They are based on simple biological "ants" that explore the network and rapidly learn good routes, using a novel variation of reinforcement learning, and their performance scales well with increasing network size on a realistic topology.

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

Two methods enable the successful combination of experience replay with multi-agent RL: using a multi-agent variant of importance sampling to naturally decay obsolete data, and conditioning each agent's value function on a fingerprint that disambiguates the age of the data sampled from the replay memory.

Simulated Annealing Based Hierarchical Q-Routing: A Dynamic Routing Protocol

The paper proposes a routing algorithm called Simulated Annealing based Hierarchical Q-Routing, which improves convergence, loop avoidance, and scalability in comparison to Q-Routing.

Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach

In simple experiments involving a 36-node, irregularly connected network, Q-routing proves superior to a nonadaptive algorithm based on precomputed shortest paths and is able to route efficiently even when critical aspects of the simulation, such as the network load, are allowed to vary dynamically.

Networked Multi-Agent Reinforcement Learning in Continuous Spaces

This paper proposes a fully decentralized actor-critic algorithm that only relies on neighbor-to-neighbor communications among agents in a networked multi-agent reinforcement learning setting, and adopts the newly proposed expected policy gradient to reduce the variance of the gradient estimate.
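
Several of the references above (Q-routing, DRQ-Routing, Hierarchical Q-Routing) build on the same per-node update: a router keeps, for each destination and neighbor, an estimate of the remaining delivery time, and nudges that estimate toward the delay it just observed plus the neighbor's own best estimate. The sketch below illustrates that commonly described update rule only. It is not the MAMRL method and is not taken from any of the cited papers' code; the class name `QRouter`, its method names, and the learning rate value are hypothetical choices made for illustration.

```python
from collections import defaultdict


class QRouter:
    """One node's Q-routing table (illustrative helper, names are hypothetical)."""

    def __init__(self, neighbors, lr=0.5):
        self.neighbors = list(neighbors)
        self.lr = lr  # learning rate (assumed value, not from the source)
        # q[dest][neighbor] = estimated delivery time to dest via that neighbor
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Smallest estimated delivery time to dest over all neighbors."""
        return min(self.q[dest].values())

    def choose(self, dest):
        """Greedy next hop: the neighbor with the lowest estimate."""
        return min(self.q[dest], key=self.q[dest].get)

    def update(self, dest, neighbor, queue_delay, link_delay, neighbor_estimate):
        """Move the estimate toward (observed delay + neighbor's best estimate)."""
        target = queue_delay + link_delay + neighbor_estimate
        old = self.q[dest][neighbor]
        self.q[dest][neighbor] = old + self.lr * (target - old)
        return self.q[dest][neighbor]
```

In use, a node would call `choose(dest)` to pick a next hop, forward the packet, and then call `update(...)` with the delays it observed and the value the neighbor reports from its own `best_estimate(dest)`.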