Corpus ID: 226227290

Optimizing Mixed Autonomy Traffic Flow With Decentralized Autonomous Vehicles and Multi-Agent RL

Eugene Vinitsky, Nathan Lichtlé, Kanaad Parvate, Alexandre M. Bayen
We study the ability of autonomous vehicles to improve the throughput of a bottleneck using a fully decentralized control scheme in a mixed autonomy setting. We consider the problem of improving the throughput of a scaled model of the San Francisco-Oakland Bay Bridge: a two-stage bottleneck where four lanes reduce to two and then reduce to one. Although there is extensive work examining variants of bottleneck control in a centralized setting, there is less study of the challenging multi-agent… 
Learning a Robust Multiagent Driving Policy for Traffic Congestion Reduction
Presents a learned multiagent driving policy that is robust to a variety of open-network traffic conditions, including vehicle flows, the fraction of AVs in traffic, AV placement, and different merging road geometries.
Control of a Mixed Autonomy Signalised Urban Intersection: An Action-Delayed Reinforcement Learning Approach
This work considers a mixed-autonomy scenario in which the intersection controller decides whether the traffic light at each lane is green or red over multiple traffic-light blocks, and proposes a model-free reinforcement learning algorithm to obtain the optimal policy.
Learning Generalizable Multi-Lane Mixed-Autonomy Behaviors in Single Lane Representations of Traffic
This paper designs a curriculum learning paradigm that exploits the natural extendability of the network to effectively learn behaviors that reduce congestion over long horizons and suggests that introducing lane change behaviors that even approximately match trends in more complex systems can significantly improve the generalizability of subsequent learned models.
Robustness and Adaptability of Reinforcement Learning-Based Cooperative Autonomous Driving in Mixed-Autonomy Traffic
The mixed-autonomy problem is formulated as a multi-agent reinforcement learning (MARL) problem, and a decentralized framework and reward function for training cooperative AVs are proposed; the approach enables AVs to learn the decision-making of HVs implicitly from experience and optimizes a social utility while prioritizing safety and allowing adaptability.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Presents an adaptation of actor-critic methods in which each agent's critic considers the action policies of the other agents, enabling it to learn policies that require complex multi-agent coordination.
Coordinating Vehicle Platoons for Highway Bottleneck Decongestion and Throughput Improvement
This paper uses a multi-class cell transmission model to capture the interaction between truck platoons and background traffic, proposes a corresponding queuing model used for control design, and derives the estimated improvement in throughput.
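The cell-transmission dynamics underlying this kind of control design can be sketched in a few lines. Below is a minimal single-class version (the paper itself uses a multi-class model); the function name `ctm_step` and all parameter values are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def ctm_step(rho, v=30.0, w=6.0, rho_jam=0.2, q_max=2.0, dt=1.0, dx=100.0):
    """One update of a single-class cell transmission model.

    rho: vehicle density in each road cell (veh/m). The flow across a
    cell boundary is the minimum of upstream demand and downstream supply.
    """
    demand = np.minimum(v * rho, q_max)               # what a cell can send
    supply = np.minimum(q_max, w * (rho_jam - rho))   # what a cell can accept
    q = np.maximum(np.minimum(demand[:-1], supply[1:]), 0.0)
    rho_next = rho.copy()
    rho_next[:-1] -= dt / dx * q   # vehicles leaving each cell
    rho_next[1:] += dt / dx * q    # vehicles entering the next cell
    return rho_next
```

With closed boundaries the total density is conserved, and a low-supply cell (e.g. a lane drop) makes a queue spill backwards, which is exactly the bottleneck behaviour that platoon coordination targets.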
Differential Variable Speed Limits Control for Freeway Recurrent Bottlenecks via Deep Reinforcement learning
A more effective deep reinforcement learning (DRL) model for differential variable speed limit (DVSL) control is proposed, in which dynamic, lane-specific speed limits can be imposed.
Toll Plaza Merging Traffic Control for Throughput Maximization
It is shown that the employed feedback regulator is largely insensitive to various settings, indicating easy applicability with low fine-tuning needs in potential field applications, and the efficiency of the proposed control concept is demonstrated.
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
QMIX employs a mixing network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
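The monotonic mixing idea can be illustrated compactly. The sketch below uses linear "hypernetworks" and a ReLU in place of the paper's ELU; the names (`qmix_mix`, `hyper`) and all shapes are illustrative, but the structural trick is QMIX's: state-conditioned mixing weights passed through abs() so the joint value is non-decreasing in every per-agent value.

```python
import numpy as np

def qmix_mix(agent_qs, state, hyper):
    """Mix per-agent Q-values into a joint Q-value, monotone in each input.

    `hyper` holds hypernetwork matrices that map the global state to
    mixing weights; abs() makes the weights non-negative, which (with a
    monotone activation) enforces dQ_joint/dQ_agent >= 0.
    """
    n, hid = len(agent_qs), hyper["b1"].shape[0]
    w1 = np.abs(hyper["w1"] @ state).reshape(n, hid)  # non-negative weights
    b1 = hyper["b1"] @ state
    w2 = np.abs(hyper["w2"] @ state)
    b2 = hyper["b2"] @ state
    h = np.maximum(agent_qs @ w1 + b1, 0.0)           # ReLU stands in for ELU
    return float(h @ w2 + b2)

# tiny demo: 3 agents, hidden width 8, 4-dimensional global state
rng = np.random.default_rng(0)
n, hid, sdim = 3, 8, 4
hyper = {
    "w1": rng.normal(size=(n * hid, sdim)),
    "b1": rng.normal(size=(hid, sdim)),
    "w2": rng.normal(size=(hid, sdim)),
    "b2": rng.normal(size=(1, sdim)),
}
state = rng.normal(size=sdim)
```

Because each agent's contribution enters only through non-negative weights, maximising each per-agent value independently also maximises the joint value, which is what makes decentralised greedy action selection consistent with the centralised objective.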
Continuous control with deep reinforcement learning
This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
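The core update of this deterministic actor-critic scheme (DDPG) can be sketched with linear function approximators standing in for the deep networks; target networks, exploration noise, and the replay buffer are omitted, and the function name and learning rate are illustrative.

```python
import numpy as np

def ddpg_update(theta, w_s, w_a, batch, gamma=0.99, lr=0.05):
    """One DDPG-style update on a batch of (s, a, r, s') transitions.

    Actor:  mu(s) = theta @ s           (deterministic policy)
    Critic: Q(s, a) = w_s @ s + w_a * a (scalar action)
    The critic regresses toward r + gamma * Q(s', mu(s')); the actor
    follows the deterministic policy gradient dQ/da * dmu/dtheta.
    """
    g_theta = np.zeros_like(theta)
    g_ws = np.zeros_like(w_s)
    g_wa = 0.0
    for s, a, r, s2 in batch:
        a2 = theta @ s2                                          # bootstrapped action
        delta = r + gamma * (w_s @ s2 + w_a * a2) - (w_s @ s + w_a * a)
        g_ws += delta * s                                        # critic gradient
        g_wa += delta * a
        g_theta += w_a * s                                       # dQ/da * dmu/dtheta
    n = len(batch)
    return (theta + lr * g_theta / n,
            w_s + lr * g_ws / n,
            w_a + lr * g_wa / n)
```

The same two-network structure, with a policy updated through the critic's action gradient, is what lets the method handle continuous action spaces that Q-learning alone cannot maximise over.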
Feedback-Based Mainstream Traffic Flow Control for Multiple Bottlenecks on Motorways
The performance of the feedback controller is shown to approach the optimal control results, even though the feedback controller additionally satisfies many practical and safety restrictions.
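A single step of such a feedback regulator has an ALINEA-like integral form; the sketch below is a generic version with an illustrative set-point, gain, bounds, and function name (`mtfc_feedback`), not the paper's tuned controller.

```python
def mtfc_feedback(rate_prev, occ_meas, occ_set=0.18, k_i=50.0,
                  rate_min=200.0, rate_max=2000.0):
    """One integral-feedback step for mainstream traffic flow control.

    The admitted flow (veh/h) rises when measured bottleneck occupancy
    is below the set-point and falls when it is above; the hard bounds
    capture the practical restrictions the controller must respect.
    """
    rate = rate_prev + k_i * (occ_set - occ_meas)
    return min(max(rate, rate_min), rate_max)
```

Driving the bottleneck occupancy toward its critical value keeps the outflow near capacity, which is why a simple regulator of this shape can approach the optimal-control benchmark.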
Towards Characterizing Divergence in Deep Q-Learning
An algorithm is developed which permits stable deep Q-learning for continuous control without any of the tricks conventionally used (such as target networks, adaptive gradient optimizers, or using multiple Q functions).
A new approach for combined freeway Variable Speed Limits and Coordinated Ramp Metering
A control strategy combining Variable Speed Limits and Coordinated Ramp Metering is proposed to improve bottleneck throughput when the bottleneck can be modeled as a lane drop (or virtual lane drop) and/or weaving section.
RLlib: Abstractions for Distributed Reinforcement Learning
This work argues for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute tasks, through RLlib: a library that provides scalable software primitives for RL.