Corpus ID: 236318267

Domain Adaptive Video Segmentation via Temporal Consistency Regularization

Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu
Video semantic segmentation is an essential task for the analysis and understanding of videos. Recent efforts largely focus on supervised video segmentation by learning from fully annotated data, but the learnt models often suffer a clear performance drop when applied to videos of a different domain. This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain…

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking
RDA, a robust domain adaptation technique that introduces adversarial attacking to mitigate overfitting in UDA, is presented; extensive experiments over multiple domain adaptation tasks show that RDA works across different computer vision tasks with superior performance.
Model Adaptation: Historical Contrastive Learning for Unsupervised Domain Adaptation without Source Data
An innovative historical contrastive learning (HCL) technique that exploits historical source hypothesis to make up for the absence of source data in unsupervised model adaptation (UMA).


Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
This paper proposes a novel framework for joint video semantic segmentation and optical flow estimation that is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference.
Action Segmentation with Mixed Temporal Domain Adaptation
Mixed Temporal Domain Adaptation is proposed to jointly align frame- and video-level embedded feature spaces across domains, and further integrate with the domain attention mechanism to focus on aligning the frame-level features with higher domain discrepancy, leading to more effective domain adaptation.
Semantic Video Segmentation by Gated Recurrent Flow Propagation
A deep, end-to-end trainable methodology for video segmentation that is capable of leveraging the information present in unlabeled data, besides sparsely labeled frames, in order to improve semantic estimates.
Shuffle and Attend: Video Domain Adaptation
This work proposes an attention mechanism that focuses on more discriminative clips and directly optimizes for video-level alignment, and proposes clip order prediction as an auxiliary task, which encourages learning representations that focus on the humans and objects involved in the actions.
Improving Semantic Segmentation via Video Propagation and Label Relaxation
This paper presents a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks, and introduces a novel boundary label relaxation technique that makes training robust to annotation noise and propagation artifacts along object boundaries.
Temporal Attentive Alignment for Large-Scale Video Domain Adaptation
This work proposes the Temporal Attentive Adversarial Adaptation Network (TA3N), which explicitly attends to the temporal dynamics using domain discrepancy for more effective domain alignment, achieving state-of-the-art performance on four video DA datasets.
Domain Adaptation for Structured Output via Discriminative Patch Representations
A domain adaptation method to adapt the source data to the unlabeled target domain by discovering multiple modes of patch-wise output distribution through the construction of a clustered space and using an adversarial learning scheme to push the feature representations of target patches in the clustered space closer to the distributions of source patches.
Action Segmentation With Joint Self-Supervised Temporal Domain Adaptation
Self-Supervised Temporal Domain Adaptation (SSTDA), which contains two self-supervised auxiliary tasks (binary and sequential domain prediction) to jointly align cross-domain feature spaces embedded with local and global temporal dynamics, achieving better performance than other domain adaptation (DA) approaches.
Low-Latency Video Semantic Segmentation
A framework for video semantic segmentation is developed, which incorporates two novel components: a feature propagation module that adaptively fuses features over time via spatially variant convolution, thus reducing the cost of per-frame computation, and an adaptive scheduler that dynamically allocates computation based on accuracy prediction.
Scale variance minimization for unsupervised domain adaptation in image segmentation
A scale variance minimization method that introduces certain supervision in the target domain by imposing a scale-invariance constraint while learning to segment an image and its scale-transformation concurrently, and achieves superior domain adaptive segmentation performance compared with the state-of-the-art.