No More Discrimination: Cross City Adaptation of Road Scene Segmenters

@inproceedings{Chen2017NoMD,
  title={No More Discrimination: Cross City Adaptation of Road Scene Segmenters},
  author={Yi-Hsin Chen and Wei-Yu Chen and Yu-Ting Chen and Bo-Cheng Tsai and Yu-Chiang Frank Wang and Min Sun},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={2011--2020}
}
Despite the recent success of deep learning-based semantic segmentation, deploying a pre-trained road scene segmenter to a city whose images are not present in the training set would not achieve satisfactory performance due to dataset biases. Instead of collecting a large number of annotated images of each city of interest to train or refine the segmenter, we propose an unsupervised learning approach to adapt road scene segmenters across different cities. By utilizing Google Street View and…
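The paper's full cross-city pipeline is not reproduced on this page, but the adversarial feature-alignment idea it shares with several of the follow-up works listed below can be sketched on toy data. Everything in this sketch (the linear feature extractor, the logistic domain discriminator, the mean-shifted 2-D "cities", the learning rates) is an illustrative assumption, not the authors' architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs from two "cities"; the domain shift is a mean offset.
Xs = rng.normal(0.0, 1.0, size=(200, 2))   # source city (labeled in practice)
Xt = rng.normal(2.0, 1.0, size=(200, 2))   # target city (unlabeled)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W = np.eye(2)              # linear feature extractor F(x) = x @ W
w, b = np.zeros(2), 0.0    # logistic domain discriminator D(f) = sigmoid(f @ w + b)
lr_d, lr_f, n = 0.1, 0.05, 400

gap_before = np.linalg.norm(Xs.mean(0) @ W - Xt.mean(0) @ W)

for _ in range(300):
    Fs, Ft = Xs @ W, Xt @ W
    ps, pt = sigmoid(Fs @ w + b), sigmoid(Ft @ w + b)  # P(domain = source)
    # Discriminator step: descend the logistic loss (source = 1, target = 0).
    w -= lr_d * (Fs.T @ (ps - 1.0) + Ft.T @ pt) / n
    b -= lr_d * ((ps - 1.0).sum() + pt.sum()) / n
    # Feature step: gradient reversal -- ascend the same loss so the
    # discriminator can no longer tell the two cities apart.
    dFs = (ps - 1.0)[:, None] * w[None, :]
    dFt = pt[:, None] * w[None, :]
    W += lr_f * (Xs.T @ dFs + Xt.T @ dFt) / n

gap_after = np.linalg.norm(Xs.mean(0) @ W - Xt.mean(0) @ W)
```

After training, the mean feature gap between the two domains shrinks relative to its initial value; in a real segmenter the extractor would be a deep network and a supervised segmentation loss on the source city would be optimized jointly with this domain-confusion term.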


Large Scale Unsupervised Domain Adaptation of Segmentation Networks with Adversarial Learning
TLDR
The core concept is to combine fully convolutional neural networks with adversarial networks for semantic segmentation, assuming the structures in the scene and objects of interest are similar in the two sets of images.
Learning to Adapt Structured Output Space for Semantic Segmentation
TLDR
A multi-level adversarial network is constructed to effectively perform output space domain adaptation at different feature levels and it is shown that the proposed method performs favorably against the state-of-the-art methods in terms of accuracy and visual quality.
An Adversarial Self-Learning Method for Cross-City Adaptation in Semantic Segmentation
TLDR
In Cityscapes-to-NTHU cross-city adaptation experiments, the adversarial self-learning method achieves state-of-the-art results compared with recently proposed domain adaptation methods.
Domain Adaptation for Structured Output via Discriminative Patch Representations
TLDR
A domain adaptation method that adapts the source data to the unlabeled target domain, using an adversarial learning scheme to push the feature representations of target patches closer to the distributions of source patches, achieving state-of-the-art performance on semantic segmentation.
Road Scenes Segmentation Across Different Domains by Disentangling Latent Representations
TLDR
This work designs and carefully analyzes multiple latent-space shaping regularization strategies that work together to reduce the domain shift, and proposes a novel evaluation metric to capture the relative performance of an adapted model with respect to supervised training.
SSF-DAN: Separated Semantic Feature Based Domain Adaptation Network for Semantic Segmentation
TLDR
A Separated Semantic Feature based domain adaptation network, named SSF-DAN, for semantic segmentation, designed to independently adapt semantic features across the target and source domains and demonstrates robust performance, superior to state-of-the-art methods on benchmark datasets.
A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes
TLDR
This work proposes a curriculum-style learning approach to minimizing the domain gap in urban scene semantic segmentation, which outperforms the baselines on two datasets and three backbone networks.
Domain Adaptive Semantic Segmentation Through Structure Enhancement
TLDR
This work proposes an effective method to adapt the segmentation network trained on synthetic images to real scenarios in an unsupervised fashion and enhances the structure information of the target images at both the feature level and the output level to improve the adaptation performance for semantic segmentation.
Domain Adaptation With Foreground/Background Cues and Gated Discriminators
TLDR
This work proposes a mask-aware gated discriminator that learns soft masks from the input foreground and background masks instead of naively performing binary masking that immediately removes information outside of the predicted masks.
ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
TLDR
This work proposes a new reality oriented adaptation approach for urban scene semantic segmentation by learning from synthetic data that takes advantage of the intrinsic spatial structure presented in urban scene images, and proposes a spatial-aware adaptation scheme to effectively align the distribution of two domains.

References

Showing 1-10 of 45 references
FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation
TLDR
This paper introduces the first domain adaptive semantic segmentation method, proposing an unsupervised adversarial approach to pixel prediction problems, and outperforms baselines across different settings on multiple large-scale datasets.
The Cityscapes Dataset for Semantic Urban Scene Understanding
TLDR
This work introduces Cityscapes, a benchmark suite and large-scale dataset to train and test approaches for pixel-level and instance-level semantic labeling, and exceeds previous attempts in terms of dataset size, annotation richness, scene variability, and complexity.
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes
TLDR
This paper generates a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations, and conducts experiments with DCNNs that show how the inclusion of SYNTHIA in the training stage significantly improves performance on the semantic segmentation task.
Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation
TLDR
Expectation-Maximization (EM) methods for semantic image segmentation model training under weakly supervised and semi-supervised settings are developed, and extensive experimental evaluation shows that the proposed techniques can learn models delivering competitive results on the challenging PASCAL VOC 2012 image segmentation benchmark, while requiring significantly less annotation effort.
From image-level to pixel-level labeling with Convolutional Networks
TLDR
A Convolutional Neural Network-based model is proposed, which is constrained during training to put more weight on pixels which are important for classifying the image, and which beats the state of the art results in weakly supervised object segmentation task by a large margin.
Semantic Instance Annotation of Street Scenes by 3D to 2D Label Transfer
TLDR
This paper annotates static 3D scene elements with rough bounding primitives and develops a model which transfers this information into the image domain and reveals that 3D information enables more efficient annotation while at the same time resulting in improved accuracy and time-coherent labels.
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
TLDR
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).
Fully Convolutional Multi-Class Multiple Instance Learning
TLDR
This work proposes a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network that exploits the further supervision given by images with multiple labels.
Learning Transferrable Representations for Unsupervised Domain Adaptation
TLDR
A unified deep learning framework where the representation, cross domain transformation, and target label inference are all jointly optimized in an end-to-end fashion for unsupervised domain adaptation is proposed.
Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation
TLDR
This work proposes a novel method to extract markedly more accurate masks from the pre-trained network itself, forgoing external objectness modules, and introduces a new form of inexpensive weak supervision yielding an additional accuracy boost.