• Corpus ID: 244130081

GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving

Raphael Chekroun, Marin Toromanoff, Sascha Hornauer, Fabien Moutarde
In this paper, we propose General Reinforced Imitation (GRI), a novel method which combines benefits from exploration and expert data and is straightforward to implement over any off-policy RL algorithm. We make one simplifying hypothesis: expert demonstrations can be seen as perfect data whose underlying policy gets a constant high reward. Based on this assumption, GRI introduces the notion of an offline demonstration agent. This agent sends expert data which are processed both concurrently and…
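One way to read the hypothesis above is as a relabelling step: expert transitions are given a constant high reward and stored alongside online experience in the replay buffer of an off-policy learner. The following is a minimal sketch under that reading, not the authors' implementation. The names `Transition`, `DEMO_REWARD`, `relabel_demonstrations`, and `MixedReplayBuffer` are hypothetical, and the value of the constant reward is an assumption.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Optional, Sequence


class Transition(NamedTuple):
    """One step of experience (hypothetical container for this sketch)."""
    state: Sequence[float]
    action: int
    reward: float
    next_state: Sequence[float]
    done: bool


# Assumed constant reward assigned to every expert transition.
DEMO_REWARD = 1.0


def relabel_demonstrations(
    demos: Sequence[Transition], demo_reward: float = DEMO_REWARD
) -> List[Transition]:
    """Return copies of the expert transitions with a constant reward."""
    return [t._replace(reward=demo_reward) for t in demos]


class MixedReplayBuffer:
    """FIFO buffer holding expert and online transitions together."""

    def __init__(self, capacity: int) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def extend(self, transitions: Sequence[Transition]) -> None:
        self._buffer.extend(transitions)

    def sample(
        self, batch_size: int, rng: Optional[random.Random] = None
    ) -> List[Transition]:
        """Uniformly sample up to batch_size transitions, whatever their origin."""
        if rng is None:
            rng = random.Random()
        k = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), k)

    def __len__(self) -> int:
        return len(self._buffer)
```

In this sketch an off-policy learner would call `sample` and update on the batch without needing to know which transitions came from the expert; how demonstrations and online data are actually proportioned and scheduled is left open here.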

Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks

This work presents a new method, able to leverage demonstrations and episodes collected online in any sparse-reward environment with any off-policy algorithm, based on a reward bonus given to demonstrations and successful episodes, encouraging expert imitation and self-imitation.

Learning from All Vehicles

A system to train driving policies from experiences collected not just from the ego-vehicle, but all vehicles that it observes, which outperforms all prior methods on the public CARLA Leaderboard by a wide margin.

DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving

A Reinforcement Learning (RL) based methodology to DEtect and FIX failures of an Imitation Learning (IL) agent by extracting infraction spots and re-constructing mini-scenarios on these infraction areas to train an RL agent for fixing the shortcomings of the IL approach.

TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving

This work proposes TransFuser, a mechanism to integrate image and LiDAR representations using self-attention, which outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin.

Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline

Current end-to-end autonomous driving methods either run a controller based on a planned trajectory or perform control prediction directly, which have spanned two separately studied lines of

How to build and validate a safe and reliable Autonomous Driving stack? A ROS based software modular architecture baseline

A powerful ROS (Robot Operating System) based modular ADS is presented that achieves state-of-the-art results in challenging scenarios based on the CARLA (Car Learning to Act) simulator, outperforming several strong baselines in a novel evaluation setting which involves non-trivial traffic scenarios and adverse environmental conditions.

Exploiting map information for self-supervised learning in motion forecasting

An auxiliary task for trajectory prediction that takes advantage of map-only information such as graph connectivity with the intent of improving map comprehension and generalization is devised and applied through multitasking and pretraining.

Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer

A safety-enhanced autonomous driving framework, named Interpretable Sensor Fusion Transformer (InterFuser), to fully process and fuse information from multi-modal multi-view sensors for achieving comprehensive scene understanding and adversarial event detection is proposed.

MMFN: Multi-Modal-Fusion-Net for End-to-End Driving

This work proposes a novel approach to extract features from vectorized High-Definition (HD) maps and utilize them in the end-to-end driving tasks and designs a new expert to further enhance the model performance by considering multi-road rules.

ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning

This paper proposes a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously, which is called ST-P3 and is the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system.



Learning by Cheating

This work shows that this challenging learning problem can be simplified by decomposing it into two stages and uses the presented approach to train a vision-based autonomous driving system that substantially outperforms the state of the art on the CARLA benchmark and the recent NoCrash benchmark.

End-to-End Urban Driving by Imitating a Reinforcement Learning Coach

A reinforcement learning expert is trained that maps bird’s-eye view images to continuous low-level actions and provides informative supervision signals for imitation learning agents to learn from, achieving expert-level performance.

Continuous control with deep reinforcement learning

This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

End-to-End Model-Free Reinforcement Learning for Urban Driving Using Implicit Affordances

This work presents a novel technique, coined implicit affordances, to effectively leverage RL for urban driving, including lane keeping, pedestrian and vehicle avoidance, and traffic light detection, and is the first to present a successful RL agent handling such a complex task, especially regarding traffic light detection.

Learning from demonstrations with SACR2: Soft Actor-Critic with Reward Relabeling

SACR2, a method based on reward relabeling, is presented for sparse-reward tasks: it gives a reward bonus to demonstrations and successful episodes, and improves the performance on this task, even in the absence of demonstrations.

SQIL: Imitation Learning via Regularized Behavioral Cloning

A way to regularize behavioral cloning so that it generalizes to out-of-distribution states is suggested: combine the standard maximum-likelihood objective with a penalty on the soft Bellman error of the soft Q function.

Reinforcement Learning from Imperfect Demonstrations

This work proposes a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data, making NAC robust to suboptimal demonstration data.

Learning from Demonstrations for Real World Reinforcement Learning

This paper presents an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages this data to massively accelerate the learning process even from relatively small amounts of demonstration data and is able to automatically assess the necessary ratio of demonstration data while learning thanks to a prioritized replay mechanism.

Exploring the Limitations of Behavior Cloning for Autonomous Driving

It is shown that behavior cloning leads to state-of-the-art results, executing complex lateral and longitudinal maneuvers, even in unseen environments, without being explicitly programmed to do so, and some limitations of the behavior cloning approach are confirmed.

Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations

This work shows that model-free DRL with natural policy gradients can effectively scale up to complex manipulation tasks with a high-dimensional 24-DoF hand, and solve them from scratch in simulated experiments.