Towards Anomaly Detection in Dashcam Videos

Sanjay Haresh, Sateesh Kumar, M. Zeeshan Zia, Quoc-Huy Tran. 2020 IEEE Intelligent Vehicles Symposium (IV).
Inexpensive sensing and computation, as well as insurance innovations, have made smart dashboard cameras ubiquitous. Increasingly, simple model-driven computer vision algorithms focused on lane departures or safe following distances are finding their way into these devices. Unfortunately, the long-tailed distribution of road hazards means that these hand-crafted pipelines are inadequate for driver safety systems. We propose to apply data-driven anomaly detection ideas from deep learning to… 


Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles

This paper addresses VAD by solving an intuitive yet challenging pretext task, i.e. spatio-temporal jigsaw puzzles, which is cast as a multi-label fine-grained classification problem, and outperforms state-of-the-art counterparts on three public benchmarks.

Augmenting Ego-Vehicle for Traffic Near-Miss and Accident Classification Dataset using Manipulating Conditional Style Translation

Extending the start and end times of the accident duration can precisely cover all ego-motions during an incident and consistently classify all possible traffic-risk accidents, including near-misses, giving more critical information for real-world driving assistance systems.

MM-trafficEvent: An Interactive Incident Retrieval System for First-view Travel-log Data

An interactive incident detection and retrieval system for first-view travel-log data, namely MM-trafficEvent, is introduced; it supports online detection of defined incidents, offline fine-grained retrieval of both defined and undefined incidents, and offline automatic creation of new incident classes from users' queries.

Using Visual Anomaly Detection for Task Execution Monitoring

This work learns to predict the motions that occur during the nominal execution of a task, using a probabilistic U-Net architecture to predict optical flow and the robot's kinematics and 3D model to account for camera and body motion.

Timestamp-Supervised Action Segmentation with Graph Convolutional Networks

A graph convolutional network is learned in an end-to-end manner to exploit both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels.

Learning by Aligning Videos in Time

We present a self-supervised approach for learning video representations using temporal video alignment as a pretext task, while exploiting both frame-level and video-level information.

A Survey of Single-Scene Video Anomaly Detection

This article summarizes research trends in anomaly detection for video feeds of a single scene, categorizes and situates past research in an intuitive taxonomy, and provides a comprehensive comparison of the accuracy of many algorithms on standard test sets.

Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments

This work is the first to explore transductive few-shot video classification, and introduces a few-shot video classification framework that leverages appearance and temporal similarity scores across multiple steps, namely prototype-based training and testing as well as inductive and transductive prototype refinement.

Unsupervised Action Segmentation by Joint Representation Learning and Online Clustering

A novel approach for unsupervised activity segmentation uses video frame clustering as a pretext task, simultaneously performing representation learning and online clustering, and incorporates a temporal regularization term into the standard optimal transport module for computing pseudo-label cluster assignments.
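The optimal-transport step described above can be sketched as a plain Sinkhorn-Knopp iteration over a frame-to-cluster cost matrix. This is a minimal illustration under uniform marginals; the paper's temporal regularization term on the transport objective is omitted, and the function name and defaults are illustrative.

```python
import numpy as np

def sinkhorn_assignments(cost, eps=0.05, n_iters=200):
    """Entropy-regularized optimal transport (Sinkhorn-Knopp).

    cost: (n_frames, n_clusters) matrix of frame-to-cluster costs.
    Returns a transport plan whose rows give soft pseudo-label
    cluster assignments under uniform frame/cluster marginals.
    """
    n, m = cost.shape
    r = np.full(n, 1.0 / n)      # uniform marginal over frames
    c = np.full(m, 1.0 / m)      # uniform marginal over clusters
    K = np.exp(-cost / eps)      # Gibbs kernel from the cost matrix
    v = np.ones(m)
    for _ in range(n_iters):     # alternate row/column rescaling
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]
```

The resulting plan's columns match the cluster marginal exactly and its rows match the frame marginal up to convergence tolerance, so each row can be read as a soft cluster assignment for that frame.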

Image Stitching and Rectification for Hand-Held Cameras

A new differential homography that can account for the scanline-varying camera poses in Rolling Shutter (RS) cameras is derived, and its application to carry out RS-aware image stitching and rectification at one stroke is demonstrated.



Spatio-Temporal AutoEncoder for Video Anomaly Detection

A novel model called Spatio-Temporal AutoEncoder (ST AutoEncoder or STAE) utilizes deep neural networks to learn video representations automatically, extracting features from both spatial and temporal dimensions via 3-dimensional convolutions, which enhances motion feature learning in videos.
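The scoring side of such reconstruction-based detectors can be sketched independently of the network itself: assuming a trained autoencoder that yields a per-frame reconstruction error, frames are commonly ranked by a min-max-normalized "regularity" score (the exact normalization here is an illustrative convention, not necessarily the one STAE uses).

```python
import numpy as np

def regularity_scores(errors):
    """Turn per-frame reconstruction errors into regularity scores.

    `errors` is a 1-D array of reconstruction errors from a trained
    spatio-temporal autoencoder (assumed given). Frames the model
    reconstructs poorly get low regularity, i.e. likely anomalies.
    """
    e = np.asarray(errors, dtype=float)
    e_min, e_max = e.min(), e.max()
    # Min-max normalize, then invert: 1.0 = perfectly regular frame.
    return 1.0 - (e - e_min) / (e_max - e_min)
```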

Real-World Anomaly Detection in Surveillance Videos

  • Waqas Sultani, Chen Chen, M. Shah
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
The experimental results show that the MIL method for anomaly detection achieves a significant improvement over state-of-the-art approaches, and results of several recent deep learning baselines on anomalous activity recognition are provided.
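The deep MIL ranking objective behind this approach can be sketched directly: the highest-scoring segment of an anomalous video (the positive bag) should outscore the highest-scoring segment of a normal video by a margin, with smoothness and sparsity regularizers on the anomalous bag. A minimal numpy version, with illustrative weight defaults:

```python
import numpy as np

def mil_ranking_loss(scores_anom, scores_norm, lam1=8e-5, lam2=8e-5):
    """Hinged MIL ranking loss over two bags of segment scores.

    scores_anom / scores_norm: per-segment anomaly scores in [0, 1]
    for one anomalous and one normal video.
    """
    a = np.asarray(scores_anom, dtype=float)
    n = np.asarray(scores_norm, dtype=float)
    # Ranking hinge: top anomalous segment should beat the top
    # normal segment by a margin of 1.
    hinge = max(0.0, 1.0 - a.max() + n.max())
    # Temporal smoothness: adjacent segment scores vary slowly.
    smooth = np.sum(np.diff(a) ** 2)
    # Sparsity: anomalies are rare, so anomalous scores stay sparse.
    sparse = np.sum(a)
    return hinge + lam1 * smooth + lam2 * sparse
```

Because only the maxima enter the hinge, the loss needs video-level labels only, never segment-level annotation, which is the point of the MIL formulation.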

Future Frame Prediction for Anomaly Detection - A New Baseline

This paper proposes to tackle the anomaly detection problem within a video prediction framework by introducing a motion (temporal) constraint: the optical flow between predicted frames and ground-truth frames is enforced to be consistent. It is the first work to introduce a temporal constraint into the video prediction task.
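At test time, prediction-based detectors of this kind typically score frames by how well they were predicted, e.g. via PSNR between the predicted and actual frame, normalized over the video so that low values flag anomalies. A small sketch of that scoring step (the function names and normalization range are illustrative):

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio between predicted and real frames."""
    mse = np.mean((np.asarray(pred) - np.asarray(gt)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def anomaly_scores(psnrs):
    """Min-max normalize per-frame PSNRs to [0, 1].

    Poorly predicted frames (low PSNR) map near 0 and are the
    candidates for anomalies under the prediction framework.
    """
    p = np.asarray(psnrs, dtype=float)
    return (p - p.min()) / (p.max() - p.min())
```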

Training Adversarial Discriminators for Cross-Channel Abnormal Event Detection in Crowds

Generative Adversarial Nets (GANs) trained to generate only the normal distribution of the data are proposed; the method outperforms previous state-of-the-art methods in both frame-level and pixel-level evaluation.

Abnormal event detection in videos using generative adversarial nets

Experimental results on challenging abnormality detection datasets show the superiority of the proposed method compared to the state of the art in both frame-level and pixel-level abnormality detection tasks.

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection

This work proposes Appearance and Motion DeepNet (AMDN) which utilizes deep neural networks to automatically learn feature representations, and introduces a novel double fusion framework, combining both the benefits of traditional early fusion and late fusion strategies.

Abnormal Event Detection at 150 FPS in MATLAB

An efficient sparse combination learning framework based on inherent redundancy of video structures achieves decent performance in the detection phase without compromising result quality and reaches high detection rates on benchmark datasets at a speed of 140-150 frames per second on average.

Deep Supervision with Intermediate Concepts

This work explores an approach for injecting prior domain structure into neural network training by supervising hidden layers of a CNN with intermediate concepts that normally are not observed in practice, and formulates a probabilistic framework which formalizes these notions and predicts improved generalization via this deep supervision method.

Learning Structure-And-Motion-Aware Rolling Shutter Correction

This paper first makes a theoretical contribution by showing that RS two-view geometry is degenerate in the case of pure translational camera motion, and proposes a Convolutional Neural Network (CNN)-based method which learns the underlying geometry from just a single RS image and performs RS image correction.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.