ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning

@inproceedings{Olsson2021ClassMixSD,
  title     = {ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning},
  author    = {Viktor Olsson and Wilhelm Tranheden and Juliano Pinto and Lennart Svensson},
  booktitle = {2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year      = {2021},
  pages     = {1368--1377}
}
The state of the art in semantic segmentation is steadily increasing in performance, resulting in more precise and reliable segmentations in many different applications. However, progress is limited by the cost of generating labels for training, which sometimes requires hours of manual labor for a single image. Because of this, semi-supervised methods have been applied to this task, with varying degrees of success. A key challenge is that common augmentations used in semi-supervised… 
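The core idea of ClassMix is to mix two unlabeled images by pasting the pixels of roughly half of the classes predicted in one image onto the other, so the mixing mask follows semantic boundaries rather than arbitrary rectangles. A minimal NumPy sketch of that mixing step is below; the function name `classmix` and the argmax-prediction input are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def classmix(img_a, img_b, pred_a, rng=None):
    """Paste half of the classes predicted in img_a onto img_b
    (ClassMix-style mixing sketch).

    img_a, img_b : float arrays of shape (H, W, C)
    pred_a       : int array of shape (H, W), argmax segmentation of img_a
    Returns the mixed image and the binary paste mask.
    """
    if rng is None:
        rng = np.random.default_rng()
    classes = np.unique(pred_a)
    # Randomly select half of the classes present in img_a.
    n_pick = max(1, len(classes) // 2)
    picked = rng.choice(classes, size=n_pick, replace=False)
    # Binary mask: 1 wherever a selected class was predicted.
    mask = np.isin(pred_a, picked).astype(img_a.dtype)[..., None]
    mixed = mask * img_a + (1.0 - mask) * img_b
    return mixed, mask[..., 0]
```

In the semi-supervised setting the same mask would also be applied to the two images' pseudo-labels to produce the target for the mixed image.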

Citations

A Simple Baseline for Semi-supervised Semantic Segmentation with Strong Data Augmentation*
TLDR
It is demonstrated that the devil is in the details: a set of simple designs and training techniques can collectively improve the performance of semi-supervised semantic segmentation significantly.
Mask-based Data Augmentation for Semi-supervised Semantic Segmentation
TLDR
This paper proposes a new approach for data augmentation, termed ComplexMix, which incorporates aspects of CutMix and ClassMix with improved performance, and which can control the complexity of the augmented data while attempting to remain semantically correct, addressing the tradeoff between complexity and correctness.
Contrastive Learning for Label Efficient Semantic Segmentation
TLDR
A simple and effective contrastive-learning-based training strategy in which the network is first pretrained using a pixel-wise, label-based contrastive loss and then fine-tuned using the cross-entropy loss, which increases intra-class compactness and inter-class separability, resulting in a better pixel classifier.
Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation
TLDR
This work proposes a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences, and implements a strong data augmentation by blending images and labels using the geometry of the scene.
A Three-Stage Self-Training Framework for Semi-Supervised Semantic Segmentation
TLDR
This work proposes a holistic solution framed as a self-training framework for semi-supervised semantic segmentation that decreases the uncertainty of the pseudo-mask by using a multi-task model that enforces consistency and that exploits the rich statistical information of the data.
Semi-Supervised Semantic Segmentation with Pixel-Level Contrastive Learning from a Class-wise Memory Bank
TLDR
The key element of this approach is the contrastive learning module that enforces the segmentation network to yield similar pixel-level feature representations for same-class samples across the whole dataset, maintaining a memory bank which is continuously updated with relevant and high-quality feature vectors from labeled data.
Bootstrapping Semantic Segmentation with Regional Contrast
TLDR
ReCo performs semi-supervised or supervised pixel-level contrastive learning on a sparse set of hard negative pixels, with minimal additional memory footprint, and consistently improves performance in both semi-supervised and supervised semantic segmentation methods, achieving smoother segmentation boundaries and faster convergence.
Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization
TLDR
This paper proposes a novel framework for discriminative pixel-level tasks using a generative model of both images and labels that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images supplemented with only a few labeled ones.
The GIST and RIST of Iterative Self-Training for Semi-Supervised Segmentation
TLDR
This work considers the task of semi-supervised semantic segmentation, where it aims to produce pixel-wise semantic object masks given only a small number of human-labeled training examples, and proposes Greedy Iterative Self-Training and Random Iterative Self-Training strategies that alternate between training on either human-labeled data or pseudo-labeled data at each refinement stage.
DACS: Domain Adaptation via Cross-domain Mixed Sampling
TLDR
DACS: Domain Adaptation via Cross-domain Mixed Sampling, which mixes images from the two domains along with the corresponding labels and pseudo-labels, and achieves state-of-the-art results for GTA5 to Cityscapes, a common synthetic-to-real semantic segmentation benchmark for UDA.
...

References

SHOWING 1-10 OF 46 REFERENCES
Semi-supervised semantic segmentation needs strong, varied perturbations
TLDR
This work finds that adapted variants of the recently proposed CutOut and CutMix augmentation techniques yield state-of-the-art semi-supervised semantic segmentation results in standard datasets.
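The adapted CutMix variant referenced above mixes pairs of images (and their pseudo-labels) under a rectangular binary mask. A minimal sketch of such a mask generator follows; the function name and the half-area choice are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def cutmix_mask(h, w, rng=None):
    """Rectangular CutMix-style binary mask covering roughly half
    the image, with random aspect ratio and position (sketch)."""
    if rng is None:
        rng = np.random.default_rng()
    area = 0.5 * h * w
    aspect = rng.uniform(0.5, 2.0)
    # Box dimensions chosen so bh * bw is approximately `area`.
    bh = min(int(round(np.sqrt(area * aspect))), h)
    bw = min(int(round(np.sqrt(area / aspect))), w)
    top = rng.integers(0, h - bh + 1)
    left = rng.integers(0, w - bw + 1)
    mask = np.zeros((h, w), dtype=np.float32)
    mask[top:top + bh, left:left + bw] = 1.0
    return mask
```

The mixed image is then `mask * img_a + (1 - mask) * img_b`, exactly as in class-based mixing but with a geometric rather than semantic mask.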
Universal Semi-Supervised Semantic Segmentation
TLDR
This paper poses the novel problem of universal semi-supervised semantic segmentation and proposes a solution framework to meet the dual needs of lower annotation and deployment costs, introducing a novel feature alignment objective based on pixel-aware entropy regularization for the latter.
Semi-supervised semantic segmentation needs strong, high-dimensional perturbations
TLDR
This work analyzes the problem of semantic segmentation and finds that the data distribution does not exhibit low density regions separating classes and offers this as an explanation for why semi-supervised segmentation is a challenging problem.
Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation
TLDR
The Naive-Student model, trained with such simple yet effective iterative semi-supervised learning, attains state-of-the-art results on all three Cityscapes benchmarks, reaching 67.8% PQ, 42.6% AP, and 85.2% mIoU on the test set.
Semi-Supervised Semantic Segmentation With High- and Low-Level Consistency
TLDR
This work proposes an approach for semi-supervised semantic segmentation that learns from limited pixel-wise annotated samples while exploiting additional annotation-free images, and achieves significant improvement over existing methods, especially when trained with very few labeled samples.
Semi-Supervised Semantic Segmentation via Dynamic Self-Training and Class-Balanced Curriculum
TLDR
The method, Dynamic Self-Training and Class-Balanced Curriculum (DST-CBC), exploits inter-model disagreement by prediction confidence to construct a dynamic loss robust against pseudo label noise, enabling it to extend pseudo labeling to a class-balanced curriculum learning process.
Semi-Supervised Semantic Segmentation With Cross-Consistency Training
TLDR
This work observes that for semantic segmentation, the low-density regions are more apparent within the hidden representations than within the inputs, and proposes cross-consistency training, where an invariance of the predictions is enforced over different perturbations applied to the outputs of the encoder.
InstaBoost: Boosting Instance Segmentation via Probability Map Guided Copy-Pasting
TLDR
This paper presents a simple, efficient and effective method to augment the training set using the existing instance mask annotations, and proposes a location probability map based approach to explore the feasible locations that objects can be placed based on local appearance similarity.
Unsupervised Data Augmentation for Consistency Training
TLDR
A new perspective on how to effectively noise unlabeled examples is presented and it is argued that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning.
Milking CowMask for Semi-Supervised Image Classification
TLDR
A novel mask-based augmentation method called CowMask is presented, using it to provide perturbations for semi-supervised consistency regularization, which achieves a state-of-the-art result on ImageNet with 10% labeled data.
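CowMask produces irregular, blob-shaped binary masks by smoothing random noise and thresholding it. A NumPy-only sketch of that construction is below; smoothing via separable 1-D convolutions and thresholding at a quantile are simplifications of the paper's construction (which thresholds at the noise mean plus a computed offset), so treat this as an illustration of the idea rather than the published method:

```python
import numpy as np

def cowmask(h, w, sigma=8.0, p=0.5, rng=None):
    """Blob-shaped CowMask-style binary mask: Gaussian-smoothed
    noise thresholded so roughly a proportion p of pixels are 1."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.standard_normal((h, w))
    # Separable Gaussian smoothing with a truncated 1-D kernel.
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    k /= k.sum()
    sm = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, noise)
    sm = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, sm)
    # Threshold at the (1 - p) quantile so about p of the mask is 1.
    thresh = np.quantile(sm, 1.0 - p)
    return (sm >= thresh).astype(np.float32)
```

Larger `sigma` yields larger, smoother blobs; `p` controls how much of each image the mask covers.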
...