• Corpus ID: 232148005

Decision-Making under On-Ramp merge Scenarios by Distributional Soft Actor-Critic Algorithm

@article{Kong2021DecisionMakingUO,
  title={Decision-Making under On-Ramp merge Scenarios by Distributional Soft Actor-Critic Algorithm},
  author={Yiting Kong and Yang Guan and Jingliang Duan and Shengbo Eben Li and Qi Sun and Bingbing Nie},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.04535}
}
Funding information: This work is supported by the International Science & Technology Cooperation Program of China under 2019YFE0100200, and also by the Tsinghua University-Toyota Joint Research Center for AI Technology of Automated Vehicle.

Merging into the highway from the on-ramp is an essential scenario for automated driving. Decision-making in this scenario must balance safety and efficiency to optimize a long-term objective, which is challenging due to the…

References

Microscopic Traffic Simulation using SUMO

TLDR
The latest developments in intermodal traffic solutions, simulator coupling, and model development and validation are presented, using the open source traffic simulator SUMO as the example.

Mixed Reinforcement Learning for Efficient Policy Optimization in Stochastic Environments

  • Yao Mu, Baiyu Peng, Bo Zhang
  • Computer Science
    2020 20th International Conference on Control, Automation and Systems (ICCAS)
  • 2020
TLDR
A mixed reinforcement learning (mixed RL) algorithm is proposed that simultaneously uses dual representations of environmental dynamics to search for the optimal policy by alternating between policy evaluation (PEV) and policy improvement (PIM).

Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization

TLDR
This paper proposes an RL training algorithm, model accelerated proximal policy optimization (MA-PPO), which incorporates a prior model into the proximal policy optimization (PPO) algorithm to accelerate learning in terms of sample efficiency, and presents the state, action, and reward design that formulates centralized coordination as an RL problem.

Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data

TLDR
This study presents a hierarchical reinforcement learning method for decision making of self-driving cars, which does not depend on a large amount of labelled driving data and comprehensively considers both high-level manoeuvre selection and low-level motion control in both lateral and longitudinal directions.

Addressing Value Estimation Errors in Reinforcement Learning with a State-Action Return Distribution Function

TLDR
This work incorporates the distributional return function into the maximum entropy RL framework to develop what it calls the Distributional Soft Actor-Critic (DSAC) algorithm, an off-policy method for continuous control settings, and proposes a new Parallel Asynchronous Buffer-Actor-Learner architecture to improve learning efficiency.

Direct and indirect reinforcement learning

TLDR
This paper classifies RL into direct and indirect RL according to how each seeks the optimal policy of the Markov decision process problem, and shows that both can derive the actor–critic architecture and can be unified into a policy gradient (PG) with the approximate value function and the stationary state distribution, revealing the equivalence of direct and indirect RL.

Soft Actor-Critic Algorithms and Applications

TLDR
Soft Actor-Critic (SAC), the recently introduced off-policy actor-critic algorithm based on the maximum entropy RL framework, achieves state-of-the-art performance, outperforming prior on-policy and off-policy methods in sample efficiency and asymptotic performance.

High-level Decision Making for Safe and Reasonable Autonomous Lane Changing using Reinforcement Learning

TLDR
A deep reinforcement learning (RL) agent learns to drive as close as possible to a desired velocity by executing reasonable lane changes on simulated highways with an arbitrary number of lanes, using a minimal state representation and a Deep Q-Network.

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

TLDR
This paper proposes soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework, and achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods.