Warp-Refine Propagation: Semi-Supervised Auto-labeling via Cycle-consistency

Aditya Ganeshan, Alexis Vallet, Yasunori Kudo, Shin-ichi Maeda, Tommi Kerola, Rares Ambrus, Dennis Park, Adrien Gaidon
2021 IEEE/CVF International Conference on Computer Vision (ICCV)
Deep learning models for semantic segmentation rely on expensive, large-scale, manually annotated datasets. Labelling is a tedious process that can take hours per image. Automatically annotating video sequences by propagating sparsely labeled frames through time is a more scalable alternative. In this work, we propose a novel label propagation method, termed Warp-Refine Propagation, that combines semantic cues with geometric cues to efficiently auto-label videos. Our method learns to refine… 
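As a rough illustration of the geometric side of label propagation, the sketch below warps a label map from an annotated frame to a neighbouring frame using a dense flow field, then discards pixels that fail a forward-backward (cycle) consistency check. This is a generic NumPy sketch, not the refinement method of the paper, whose details are cut off in the abstract above. The function names, the `IGNORE` id, the flow layout `(dx, dy)`, and the tolerance are all assumptions made for the example.

```python
import numpy as np

IGNORE = 255  # hypothetical id for "no label"


def warp_labels(labels, flow):
    """Backward-warp an integer label map with nearest-neighbour sampling.

    labels: (H, W) int array for the annotated (source) frame.
    flow:   (H, W, 2) array; flow[y, x] = (dx, dy) points from a pixel of the
            target frame to its match in the source frame (assumed layout).
    """
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.rint(xs + flow[..., 0]).astype(int)
    src_y = np.rint(ys + flow[..., 1]).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    warped = np.full((h, w), IGNORE, dtype=labels.dtype)
    warped[inside] = labels[src_y[inside], src_x[inside]]
    return warped


def cycle_consistent(flow_ab, flow_ba, tol=1.0):
    """Mask of frame-A pixels whose A->B->A round trip ends within tol pixels."""
    h, w = flow_ab.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    bx = np.rint(xs + flow_ab[..., 0]).astype(int)
    by = np.rint(ys + flow_ab[..., 1]).astype(int)
    inside = (bx >= 0) & (bx < w) & (by >= 0) & (by < h)
    # Clip only so the lookup is valid; out-of-frame pixels are masked anyway.
    bx_c = np.clip(bx, 0, w - 1)
    by_c = np.clip(by, 0, h - 1)
    round_trip = flow_ab + flow_ba[by_c, bx_c]
    err = np.linalg.norm(round_trip, axis=-1)
    return inside & (err <= tol)


# Toy example: the scene shifts one pixel to the right between frames.
labels = np.array([[0, 0, 1, 1]] * 3)
flow_t2s = np.zeros((3, 4, 2))
flow_t2s[..., 0] = -1.0  # target -> source
flow_s2t = np.zeros((3, 4, 2))
flow_s2t[..., 0] = 1.0   # source -> target

warped = warp_labels(labels, flow_t2s)
mask = cycle_consistent(flow_t2s, flow_s2t)
propagated = np.where(mask, warped, IGNORE)
```

Pixels rejected by the mask stay unlabeled rather than receiving a possibly wrong label, which is one common way to keep propagation errors out of the training set.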


Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation
Experimental results show that Panoptic NeRF outperforms existing semantic and instance label transfer methods in terms of accuracy and multi-view consistency on challenging urban scenes of the KITTI-360 dataset.


Improving Semantic Segmentation via Video Propagation and Label Relaxation
This paper presents a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks, and introduces a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries.
Can Ground Truth Label Propagation from Video Help Semantic Segmentation?
This work systematically analyzes which kind of pseudo ground truth (PGT) should be added to the ground truth (GT) when training a CNN, and concludes that PGT that is diverse relative to the GT images and well labeled can indeed improve the performance of a CNN.
Semantics through Time: Semi-supervised Segmentation of Aerial Videos with Iterative Label Propagation
This paper introduces SegProp, a novel iterative flow-based method, with a direct connection to spectral clustering in space and time, to propagate the semantic labels to frames that lack human annotations, significantly outperforming other state-of-the-art label propagation methods.
Semantic Video CNNs Through Representation Warping
A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training.
Large Scale Labelled Video Data Augmentation for Semantic Segmentation in Driving Scenarios
This work uses an occlusion-aware and uncertainty-enabled label propagation algorithm to generate additional labelled data, increasing the availability of high-resolution labelled frames by a factor of 20 and yielding a 6.8% to 10…
Efficient Video Semantic Segmentation with Labels Propagation and Refinement
The proposed Efficient Video Segmentation (EVS) pipeline achieves accuracy competitive with existing real-time methods for semantic image segmentation (mIoU above 60%), while achieving much higher frame rates.
LT-Net: Label Transfer by Learning Reversible Voxel-Wise Correspondence for One-Shot Medical Image Segmentation
A one-shot segmentation method that alleviates the burden of manual annotation for medical images by relying on forward-backward consistency, which is widely used in correspondence problems, and that additionally learns the backward correspondences from the warped atlases back to the original atlas.
Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-training
This paper proposes a novel unsupervised domain adaptation (UDA) framework based on an iterative self-training (ST) procedure, where the problem is formulated as latent variable loss minimization and can be solved by alternately generating pseudo labels on target data and re-training the model with these labels.
Hierarchical Multi-Scale Attention for Semantic Segmentation
This work presents an attention-based approach to combining multi-scale predictions, and shows that predictions at certain scales are better at resolving particular failure modes, and that the network learns to favor those scales for such cases in order to generate better predictions.
Label propagation in video sequences
This paper proposes a probabilistic graphical model for propagating labels in video sequences, also termed the label propagation problem, and reports studies on a state-of-the-art video segmentation scheme based on a random forest classifier, trained using fully ground truth data and with data obtained from label propagation.
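
Several of the summaries above rely on pseudo-labels, and the self-training entry alternates between generating them and retraining. The sketch below shows one simple way to pick pseudo-labels with a per-class quota instead of a single global confidence threshold, so that rarely predicted classes are not filtered out entirely. It is a simplified illustration under stated assumptions, not the procedure of any listed paper: `select_pseudo_labels`, `keep_fraction`, and the `IGNORE` id are hypothetical names chosen for the example.

```python
import numpy as np

IGNORE = 255  # hypothetical id for "no pseudo-label"


def select_pseudo_labels(probs, keep_fraction=0.5):
    """Keep the most confident predictions of each class.

    probs: (N, C) array of per-sample class probabilities.
    Returns an (N,) int array; rejected samples are set to IGNORE.
    """
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    labels = np.full(pred.shape, IGNORE, dtype=np.int64)
    for c in np.unique(pred):
        idx = np.flatnonzero(pred == c)
        # Per-class quota: at least one sample, otherwise a fixed fraction.
        k = max(1, int(np.ceil(keep_fraction * idx.size)))
        top = idx[np.argsort(-conf[idx], kind="stable")[:k]]
        labels[top] = c
    return labels


probs = np.array([
    [0.90, 0.10],
    [0.60, 0.40],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.40, 0.60],
])
pseudo = select_pseudo_labels(probs, keep_fraction=0.5)
```

In this toy input, a global threshold of 0.7 would keep only class-0 samples, whereas the per-class quota also retains the single class-1 prediction.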