Reinforcement Learning for Mixed Autonomy Intersections

  • Zhongxia Yan, Cathy Wu
  • Published 19 September 2021
  • Computer Science
  • 2021 IEEE International Intelligent Transportation Systems Conference (ITSC)
We propose a model-free reinforcement learning method for controlling mixed autonomy traffic in simulated traffic networks with through-traffic-only two-way and four-way intersections. Our method utilizes multi-agent policy decomposition which allows decentralized control based on local observations for an arbitrary number of controlled vehicles. We demonstrate that, even without reward shaping, reinforcement learning learns to coordinate the vehicles to exhibit traffic signal-like behaviors… 
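The multi-agent policy decomposition the abstract describes can be pictured as one shared policy applied independently to each controlled vehicle's local observation, so the same trained weights control an arbitrary number of vehicles. The sketch below is an illustrative toy (random weights, invented feature and action sizes), not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared-policy sketch: a single weight matrix is applied
# independently to each vehicle's local observation, so the policy
# scales to any number of controlled vehicles without retraining.
W = rng.standard_normal((2, 4))  # 4 local features -> 2 actions (toy sizes)

def act(local_obs):
    """Map one vehicle's local observation to a discrete action."""
    logits = W @ local_obs
    return int(np.argmax(logits))

# Three vehicles, each with its own local observation vector
# (e.g. speed, headway, distance to intersection, priority flag).
observations = [rng.standard_normal(4) for _ in range(3)]
actions = [act(o) for o in observations]  # decentralized: no joint state needed
print(actions)
```

Each vehicle acts only on what it can observe locally, which is what makes the control decentralized.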


Learning Eco-Driving Strategies at Signalized Intersections
Signalized intersections in arterial roads result in persistent vehicle idling and excess accelerations, contributing to fuel consumption and CO2 emissions. There has thus been a line of work…


Framework for control and deep reinforcement learning in traffic
  • Cathy Wu, Kanaad Parvate, A. Bayen
  • Computer Science
    2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC)
  • 2017
A framework called CISTAR (Customized Interface for SUMO, TraCI, and RLLab) is presented that integrates the widely used traffic simulator SUMO with the standard deep reinforcement learning library RLLab, creating an interface that allows easy customization of SUMO.
Emergent Behaviors in Mixed-Autonomy Traffic
The present article formulates and approaches the mixed-autonomy traffic control problem using the powerful framework of deep reinforcement learning (RL) to provide insight into the potential for automation of traffic through mixed fleets of automated and manned vehicles.
Lagrangian Control through Deep-RL: Applications to Bottleneck Decongestion
Using deep reinforcement learning, novel control policies for autonomous vehicles are derived to improve the throughput of a bottleneck modeled after the San Francisco-Oakland Bay Bridge, and it is shown that the AV controller provides comparable performance to ramp metering without the need to build new ramp metering infrastructure.
CoLight: Learning Network-level Cooperation for Traffic Signal Control
The proposed model, CoLight, is the first to use graph attentional networks in the setting of reinforcement learning for traffic signal control and to conduct experiments on the large-scale road network with hundreds of traffic signals.
Cooperative Multi-agent Control Using Deep Reinforcement Learning
It is shown that policy gradient methods tend to outperform both temporal-difference and actor-critic methods and that curriculum learning is vital to scaling reinforcement learning algorithms in complex multi-agent domains.
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.
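The adaptation summarized above (actors that act on local observations while a critic conditions on all agents' actions during training) can be sketched in a few lines. Everything here is a hypothetical toy: linear actors and critic, invented dimensions, no training loop:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sketch (not the cited paper's code): a centralized critic
# scores the JOINT state and all agents' actions during training, while
# each agent's actor sees only its own observation at execution time.
n_agents, obs_dim, act_dim = 2, 3, 1

actor_W = [rng.standard_normal((act_dim, obs_dim)) for _ in range(n_agents)]
critic_W = rng.standard_normal(n_agents * (obs_dim + act_dim))

def actor(i, obs):
    return actor_W[i] @ obs  # decentralized action from local observation

def centralized_q(all_obs, all_acts):
    # The critic concatenates every agent's observation and action.
    joint = np.concatenate([np.concatenate([o, a]) for o, a in zip(all_obs, all_acts)])
    return float(critic_W @ joint)

obs = [rng.standard_normal(obs_dim) for _ in range(n_agents)]
acts = [actor(i, o) for i, o in enumerate(obs)]
print(round(centralized_q(obs, acts), 3))
```

The design point is the train/execute asymmetry: the critic's extra information is only needed during learning, so execution remains fully decentralized.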
Polling-Systems-Based Autonomous Vehicle Coordination in Traffic Intersections With No Traffic Signals
The proposed coordination control algorithm extends the widely studied polling systems analysis to the case involving customers subject to second-order differential constraints, providing provable guarantees on 1) safety, in that no collisions occur, and 2) performance, in the form of rigorous bounds on the expected delay.
A Protocol for Mixed Autonomous and Human-Operated Vehicles at Intersections
A new protocol, H-AIM, is introduced; applicable as long as AIM is applicable and the infrastructure is able to sense approaching vehicles, it can decrease traffic delay for autonomous vehicles even at a 1% technology penetration rate.
Planning, Learning and Coordination in Multiagent Decision Processes
The extent to which methods from single-agent planning and learning can be applied in multiagent settings is investigated, along with the decomposition of sequential decision processes so that coordination can be learned locally, at the level of individual states.
Policy Distillation
A novel method called policy distillation is presented that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient.
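Policy distillation, as summarized above, typically trains the smaller student network to match the teacher's action distribution. The snippet below sketches one assumed form of that objective (a KL divergence with a hypothetical temperature knob `tau`); it is an illustration, not the cited paper's exact loss:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, tau=1.0):
    """KL(teacher || student) over action probabilities.

    `tau` is a hypothetical temperature that softens the teacher's
    distribution before matching; tau=1.0 leaves it unchanged.
    """
    p = softmax(teacher_logits / tau)
    q = softmax(student_logits)
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = np.array([2.0, 0.5, -1.0])  # expert's action preferences
student = np.array([1.8, 0.6, -0.9])  # smaller network's current output
loss = distillation_loss(teacher, student)
print(loss >= 0.0)  # KL divergence is non-negative
```

Minimizing this loss over states visited by the teacher pushes the compact student toward expert-level behavior.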