Value of Temporal Dynamics Information in Driving Scene Segmentation

@article{Ding2022ValueOT,
  title={Value of Temporal Dynamics Information in Driving Scene Segmentation},
  author={Li Ding and Jack Terwilliger and Rini Sherony and Bryan Reimer and Alex Fridman},
  journal={IEEE Transactions on Intelligent Vehicles},
  year={2022},
  volume={7},
  pages={113--122}
}
Semantic scene segmentation has primarily been addressed by forming high-level visual representations of single images. The problem of semantic segmentation in dynamic scenes has begun to receive attention through the video object segmentation and tracking problem. While there has been some recent work attempting to apply deep learning models at the video level, it is not yet known how much temporal dynamics information contributes to full scene segmentation. Moreover, most existing datasets…

Citations

Temporal information integration for video semantic segmentation
TLDR: A temporal Bayesian filter for semantic segmentation of a video sequence is presented, using a data-driven prediction function derived from the dense optical flow between images t and t + 1 computed by a deep neural network.
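The temporal filtering idea summarized above can be sketched as a per-pixel Bayesian update: the class posterior from frame t is warped into frame t + 1 by the dense optical flow and multiplied by the network's prediction for the new frame. A minimal numpy sketch, assuming nearest-neighbour warping and a small uniform mixing weight to absorb flow errors; all names and the `mix` parameter are illustrative, not the paper's implementation:

```python
import numpy as np

def warp_probs(probs, flow):
    """Warp per-pixel class probabilities (H, W, C) from frame t into
    frame t+1 using a dense backward optical flow (H, W, 2).
    Nearest-neighbour lookup keeps the sketch dependency-free."""
    h, w, _ = probs.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return probs[src_y, src_x]

def bayesian_update(prior_probs, likelihood, mix=0.1):
    """Combine the flow-warped prior with the current frame's softmax
    output; `mix` blends in a uniform distribution so a bad flow
    estimate cannot lock the filter into a wrong class."""
    c = prior_probs.shape[-1]
    prior = (1 - mix) * prior_probs + mix / c   # hedge against flow errors
    post = prior * likelihood                    # per-pixel Bayes rule
    return post / post.sum(axis=-1, keepdims=True)
```

With a uniform prior the update simply returns the current frame's prediction; the temporal information only matters once the warped prior is informative.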
ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation
TLDR: This paper improves spatio-temporal point cloud feature learning with a flexible module called ASAP that considers both attention and structure information across frames, which the authors find to be two important factors for successful segmentation of dynamic point clouds.
MIT DriveSeg (Manual) Dataset for Dynamic Driving Scene Segmentation
This technical report summarizes and provides more detailed information about the MIT DriveSeg dataset [3], including technical aspects in data collection, annotation, and potential research
Applying quantum approximate optimization to the heterogeneous vehicle routing problem
Quantum computing offers new heuristics for combinatorial problems. With small- and intermediate-scale quantum devices becoming available, it is possible to implement and test these heuristics on

References

Showing 1-10 of 59 references
Video Object Segmentation without Temporal Information
TLDR: Semantic One-Shot Video Object Segmentation is presented, based on a fully convolutional neural network architecture that successively transfers generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object in the test sequence (hence one-shot).
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
TLDR: This work proposes a novel ResNet-like architecture that exhibits strong localization and recognition performance, and combines multi-scale context with pixel-level accuracy by using two processing streams within the network.
Video segmentation by tracing discontinuities in a trajectory embedding
TLDR: This work proposes a novel embedding discretization process that recovers from over-fragmentations by merging clusters according to discontinuity evidence along inter-cluster boundaries, and presents experimental results that outperform the state-of-the-art on challenging motion segmentation datasets.
Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation
TLDR: This work shows how to improve semantic segmentation through the use of contextual information, specifically 'patch-patch' context between image regions and 'patch-background' context, and formulates conditional random fields with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches.
Learning Video Object Segmentation with Visual Memory
TLDR: A novel two-stream neural network with an explicit memory module is proposed for segmenting moving objects in unconstrained videos, together with an extensive ablative analysis of the influence of each component in the framework.
One-Shot Video Object Segmentation
TLDR: One-Shot Video Object Segmentation (OSVOS) is presented, based on a fully convolutional neural network architecture that successively transfers generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object in the test sequence (hence one-shot).
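The one-shot idea above amounts to briefly adapting a pretrained model to the single annotated first frame, then applying the adapted model to the rest of the sequence. As a dependency-free stand-in for network fine-tuning, here is a minimal numpy sketch that fits a per-pixel logistic classifier on first-frame features; the feature representation, function names, and step/learning-rate settings are all illustrative assumptions, not OSVOS itself:

```python
import numpy as np

def one_shot_finetune(features, mask, steps=200, lr=0.5):
    """Fit a per-pixel logistic classifier to the single annotated frame
    (features: (H, W, D), mask: (H, W) in {0, 1}) by gradient descent,
    standing in for brief network fine-tuning on the first frame."""
    h, w, d = features.shape
    x = features.reshape(-1, d)
    y = mask.reshape(-1).astype(float)
    wts, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ wts + b)))  # sigmoid probabilities
        wts -= lr * (x.T @ (p - y)) / len(y)       # log-loss gradient step
        b -= lr * (p - y).mean()
    return wts, b

def segment(features, wts, b, thr=0.5):
    """Apply the adapted classifier to a later frame's features."""
    logits = features @ wts + b
    return (1.0 / (1.0 + np.exp(-logits)) > thr).astype(np.uint8)
```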
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
TLDR: This work builds a new large-scale video object segmentation dataset called YouTube Video Object Segmentation dataset (YouTube-VOS) and proposes a novel sequence-to-sequence network to fully exploit long-term spatial-temporal information in videos for segmentation.
A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation
TLDR: This work presents a new benchmark dataset and evaluation methodology for the area of video object segmentation, named DAVIS (Densely Annotated VIdeo Segmentation), and provides a comprehensive analysis of several state-of-the-art segmentation approaches using three complementary metrics.
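One of the DAVIS metrics, region similarity (usually denoted J), is simply the intersection-over-union between a predicted and a ground-truth binary mask. A minimal sketch (the function name and the empty-mask convention are mine):

```python
import numpy as np

def region_similarity(pred, gt):
    """Jaccard index J between two binary masks (DAVIS region metric):
    |pred AND gt| / |pred OR gt|."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:            # both masks empty: define J = 1
        return 1.0
    return np.logical_and(pred, gt).sum() / union
```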
Saliency-aware geodesic video object segmentation
TLDR: This work introduces an unsupervised, geodesic-distance-based salient video object segmentation method that incorporates saliency as a prior for objects via robust geodesic measurement and builds global appearance models for foreground and background.
Key-segments for video object segmentation
TLDR: The method first identifies object-like regions in any frame according to both static and dynamic cues, then computes a series of binary partitions among candidate “key-segments” to discover hypothesis groups with persistent appearance and motion.
...