DRAMA: Joint Risk Localization and Captioning in Driving

Srikanth Malla, Chiho Choi, Isht Dwivedi, Joonhyang Choi, Jiachen Li
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Considering the functionality of situational awareness in safety-critical automation systems, the perception of risk in driving scenes and its explainability is of particular importance for autonomous and cooperative driving. Toward this goal, this paper proposes a new research direction of joint risk localization in driving scenes and its risk explanation as a natural language description. Due to the lack of standard benchmarks, we collected a large-scale dataset, DRAMA (Driving Risk… 
1 Citation

Cognitive Accident Prediction in Driving Scenes: A Multimodality Benchmark

A Cognitive Accident Prediction (CAP) method is proposed that explicitly leverages human-inspired cognition of text description on the visual observation and the driver attention to facilitate model training, and its superiority over state-of-the-art approaches is validated.



DADA: A Large-scale Benchmark and Model for Driver Attention Prediction in Accidental Scenarios

A multi-path semantic-guided attentive fusion network (MSAFNet) is proposed that learns the spatio-temporal semantic and scene variation in prediction and is applicable to driver attention prediction in accidental scenarios.

Talk2Car: Taking Control of Your Self-Driving Car

This work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars, and provides a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref.

Predicting Driver Attention in Critical Situations

A new in-lab driver attention collection protocol is proposed and a new driver attention dataset is introduced, Berkeley DeepDrive Attention (BDD-A) dataset, which is built upon braking event videos selected from a large-scale, crowd-sourced driving video dataset.

Grounding Human-To-Vehicle Advice for Self-Driving Vehicles

It is shown that taking advice improves the performance of the end-to-end network, while the network cues on a variety of visual features that are provided by advice.

Goal-oriented Object Importance Estimation in On-road Driving Videos

A novel framework that incorporates both visual model and goal representation to conduct Object Importance Estimation (OIE) in on-road driving videos is proposed and it is demonstrated that binary brake prediction can be improved with the information of object importance.

Interpretable Learning for Self-Driving Cars by Visualizing Causal Attention

  • Jinkyu Kim, J. Canny
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
This work uses a visual attention model to train a convolution network end-to-end from images to steering angle and shows that the network causally cues on a variety of features that are used by humans while driving.

Textual Explanations for Self-Driving Vehicles

A new approach to introspective explanations is proposed which uses a visual (spatial) attention model to train a convolutional network end-to-end from images to the vehicle control commands, and two approaches to attention alignment, strong- and weak-alignment are explored.

Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning

This work presents the Honda Research Institute Driving Dataset (HDD), a challenging dataset to enable research on learning driver behavior in real-life environments and provides a detailed analysis of HDD with a comparison to other driving datasets.

DR(eye)VE: A Dataset for Attention-Based Tasks with Applications to Autonomous and Assisted Driving

A novel and publicly available dataset acquired during actual driving that contains drivers' gaze fixations and their temporal integration providing task-specific saliency maps and can foster new discussions on better understanding, exploiting and reproducing the driver's attention process in the autonomous and assisted cars of future generations.

Agent-Centric Risk Assessment: Accident Anticipation and Risky Region Localization

A novel soft-attention Recurrent Neural Network (RNN) which explicitly models both spatial and appearance-wise non-linear interaction between the agent triggering the event and another agent or static-region involved is proposed.