Real Time Monocular Vehicle Velocity Estimation using Synthetic Data

Robert McCraith, Lukás Neumann, Andrea Vedaldi
2021 IEEE Intelligent Vehicles Symposium (IV)

Vision is one of the primary sensing modalities in autonomous driving. In this paper we look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car. Contrary to prior methods that train end-to-end deep networks that estimate the vehicles' velocity from the video pixels, we propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity from…
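
The two-step idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the hypothetical `track_to_features` helper, the `TinyMLP` class, the choice of normalised box corners as input, the hidden-layer size, and the 2-D velocity output. The tracker itself is replaced by a hand-written list of boxes, and the weights are random rather than trained.

```python
import random
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; assumed box format.
Box = Tuple[float, float, float, float]


def track_to_features(track: List[Box], width: int, height: int) -> List[float]:
    """Flatten a track of boxes into one vector, scaling pixels to [0, 1]."""
    feats: List[float] = []
    for x0, y0, x1, y1 in track:
        feats += [x0 / width, y0 / height, x1 / width, y1 / height]
    return feats


class TinyMLP:
    """One hidden ReLU layer with a 2-D output (hypothetical layout)."""

    def __init__(self, n_in: int, n_hidden: int = 16, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                   for _ in range(2)]
        self.b2 = [0.0, 0.0]

    def forward(self, x: List[float]) -> List[float]:
        # Hidden layer: affine map followed by ReLU.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Output layer: affine map giving two velocity components.
        return [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(self.w2, self.b2)]


# Stand-in for tracker output: a box drifting right and growing over 5 frames.
track = [(100.0 + 4 * t, 200.0, 160.0 + 6 * t, 250.0 + 2 * t) for t in range(5)]
features = track_to_features(track, width=1280, height=720)
model = TinyMLP(n_in=len(features))
velocity = model.forward(features)  # untrained weights, so values are meaningless
```

The point of the sketch is only the data flow: the regressor sees a few box coordinates per frame rather than raw pixels, which is why it can stay small.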


Camera-based vehicle velocity estimation from monocular video
Light-weight trajectory-based features are found to outperform depth and motion cues extracted from deep ConvNets, especially for far-distance predictions, where current disparity and optical flow estimators are challenged significantly.
Unsupervised Learning of Depth and Ego-Motion from Video
Empirical evaluation demonstrates the effectiveness of the unsupervised learning framework: monocular depth estimation performs comparably with supervised methods that use either ground-truth pose or depth for training, and pose estimation performs favorably compared to established SLAM systems under comparable input settings.
Supervising the New with the Old: Learning SFM from SFM
This paper proposes a probabilistic learning formulation where the network predicts distributions over variables rather than specific values, and shows that this formulation can learn and account for the defects of SFM, helping to integrate different sources of information and boosting the overall performance of the network.
Fully-Convolutional Siamese Networks for Object Tracking
A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.
Virtual Worlds as Proxy for Multi-object Tracking Analysis
This work proposes an efficient real-to-virtual world cloning method and validates the approach by building and publicly releasing a new video dataset, called "Virtual KITTI", automatically labeled with accurate ground truth for object detection, tracking, scene and instance segmentation, depth, and optical flow.
DeMoN: Depth and Motion Network for Learning Monocular Stereo
This work trains a convolutional network end-to-end to compute depth and camera motion from successive, unconstrained image pairs. In contrast to the popular depth-from-single-image networks, DeMoN learns the concept of matching and generalizes better to structures not seen during training.
FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
The concept of end-to-end learning of optical flow is advanced and made to work really well, and faster variants are presented that allow optical flow computation at up to 140 fps with accuracy matching the original FlowNet.
CNN-based multi-frame IMO detection from a monocular camera
This paper presents a method for detecting independently moving objects (IMOs) from a monocular camera mounted on a moving car, and evaluates the performance of the method on the KITTI dataset, focusing on sub-sequences containing IMOs.
Learning to Track at 100 FPS with Deep Regression Networks
This work proposes a method for offline training of neural networks that can track novel objects at test time at 100 fps, significantly faster than previous neural-network trackers, which are typically very slow to run and not practical for real-time applications.
Instantaneous lateral velocity estimation of a vehicle using Doppler radar
This work presents a robust, model-free approach to determining the velocity vector of an extended target. The approach can handle noise and systematic variations in the signal and is optimized to deal with measurement errors of the radar sensor not only in the radial velocity but also in the azimuth position.