Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

@article{Wang2020SelfSupervisedEA,
  title={Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation},
  author={Yude Wang and Jie Zhang and Meina Kan and S. Shan and Xilin Chen},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={12272-12281}
}
  • Yude Wang, Jie Zhang, Meina Kan, S. Shan, Xilin Chen
  • Published 9 April 2020
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Image-level weakly supervised semantic segmentation is a challenging problem that has been deeply studied in recent years. Most advanced solutions exploit class activation maps (CAMs). However, CAMs can hardly serve as object masks because of the gap between full and weak supervision. In this paper, we propose a self-supervised equivariant attention mechanism (SEAM) to discover additional supervision and narrow the gap. Our method is based on the observation that equivariance is an implicit…
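The abstract is cut off above, so the following is only a generic illustration of what equivariance means for a spatial transform, not the authors' SEAM implementation. A map F is equivariant to a transform A when F(A(x)) = A(F(x)). The sketch below measures how far two toy maps are from that property under a horizontal flip. Every name in it (`hflip`, `symmetric_map`, `left_difference_map`, `equivariance_gap`) is hypothetical and chosen for illustration.

```python
# Toy equivariance check in plain Python. The two "activation maps" are
# hypothetical stand-ins and are unrelated to the networks used in the paper.

def hflip(grid):
    """Mirror a 2-D grid (a list of rows) left to right."""
    return [row[::-1] for row in grid]


def symmetric_map(image):
    """1x3 max pooling along each row; symmetric, hence flip-equivariant."""
    return [
        [max(row[max(0, j - 1): j + 2]) for j in range(len(row))]
        for row in image
    ]


def left_difference_map(image):
    """Each pixel minus its left neighbour; deliberately not flip-equivariant."""
    return [
        [row[j] - (row[j - 1] if j > 0 else 0.0) for j in range(len(row))]
        for row in image
    ]


def equivariance_gap(fn, image, transform):
    """Mean absolute difference between fn(transform(x)) and transform(fn(x))."""
    a = fn(transform(image))
    b = transform(fn(image))
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    image = [[1.0, 2.0, 4.0], [0.0, 3.0, 5.0]]
    print(equivariance_gap(symmetric_map, image, hflip))
    print(equivariance_gap(left_difference_map, image, hflip))
```

A gap of zero means the map commutes with the flip on this input. A nonzero gap is the kind of discrepancy that an equivariance-based consistency term could penalize during training, under the assumptions stated above.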
Weakly supervised segmentation with cross-modality equivariant constraints
Learning structure-aware semantic segmentation with image-level supervision
TLDR
This paper argues that the lost structure information in CAM limits its application in downstream semantic segmentation, leading to deteriorated predictions, and introduces an auxiliary semantic boundary detection module, which penalizes the deteriorated predictions.
Exploring Pixel-level Self-supervision for Weakly Supervised Semantic Segmentation
TLDR
A novel framework is proposed that derives pixel-level self-supervision from the given image-level supervision and shows state-of-the-art WSSS performance on both the train and validation sets of the PASCAL VOC 2012 dataset.
Railroad is not a Train: Saliency as Pseudo-pixel Supervision for Weakly Supervised Semantic Segmentation
TLDR
A novel framework, namely Explicit Pseudo-pixel Supervision (EPS), which learns from pixel-level feedback by combining two weak supervisions: the image-level label provides the object identity via the localization map, and the saliency map from an off-the-shelf saliency detection model offers rich boundaries.
Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast
TLDR
Weakly-supervised pixel-to-prototype contrast that can provide pixel-level supervisory signals to narrow the gap between classification and segmentation is proposed and seamlessly incorporated into existing WSSS models without any changes to the base networks and does not incur any extra inference burden.
Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation
TLDR
Results show the proposed SIPE achieves new state-of-the-art performance using only image-level labels on PASCAL VOC 2012 and MS COCO 2014 segmentation benchmark, and further optimizes the feature representation and empowers a self-correction ability of prototype exploration.
ECS-Net: Improving Weakly Supervised Semantic Segmentation by Using Connections Between Class Activation Maps
TLDR
This work uses relationships between CAMs to propose a novel weakly supervised method, Erased CAM Supervision Net (ECS-Net), which generates pixel-level labels by predicting segmentation results of those processed images, outperforming previous state-of-the-art methods.
Puzzle-CAM: Improved Localization Via Matching Partial And Full Features
  • Sanghyun Jo, In-Jae Yu
  • Computer Science
  • 2021 IEEE International Conference on Image Processing (ICIP)
  • 2021
TLDR
Puzzle-CAM, a process that minimizes differences between the features from separate patches and the whole image to discover the most integrated region in an object, can activate the overall region of an object using image-level supervision without requiring extra parameters.
Complementary Patch for Weakly Supervised Semantic Segmentation
TLDR
A novel Complementary Patch (CP) Representation is proposed, and it is proved that the information in the sum of the CAMs from a pair of input images with complementary hidden (patched) parts is greater than or equal to the information in the baseline CAM.
Weakly-Supervised Image Semantic Segmentation Using Graph Convolutional Networks
TLDR
This work formulates the generation of complete pseudo labels as a semi-supervised learning task and learns a 2-layer GCN separately for every training image by back-propagating a Laplacian and an entropy regularization loss.

References

SHOWING 1-10 OF 40 REFERENCES
Learning Pixel-Level Semantic Affinity with Image-Level Supervision for Weakly Supervised Semantic Segmentation
  • Jiwoon Ahn, Suha Kwak
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
TLDR
On the PASCAL VOC 2012 dataset, a DNN learned with segmentation labels generated by the method outperforms previous models trained with the same level of supervision, and is even as competitive as those relying on stronger supervision.
CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation
TLDR
This paper proposes an end-to-end cross-image affinity module, which exploits pixel-level cross-image relationships with only image-level labels and achieves a new state-of-the-art result for weakly supervised semantic segmentation.
Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation
TLDR
It is found that varying dilation rates can effectively enlarge the receptive fields of convolutional kernels and more importantly transfer the surrounding discriminative information to non-discriminative object regions, promoting the emergence of these regions in the object localization maps.
Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations
TLDR
IRNet is proposed, which estimates rough areas of individual instances and detects boundaries between different object classes, making it possible to assign instance labels to the seeds and to propagate them within the boundaries so that the entire areas of instances can be estimated accurately.
STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation
TLDR
A simple to complex (STC) framework in which only image-level annotations are utilized to learn DCNNs for semantic segmentation, which demonstrates the superiority of the proposed STC framework compared with other state-of-the-art frameworks.
Discovering Class-Specific Pixels for Weakly-Supervised Semantic Segmentation
TLDR
It is shown that properly combining saliency and attention maps allows for reliable cues capable of significantly boosting the performance, and a simple yet powerful hierarchical approach to discover the class-agnostic salient regions, obtained using a salient object detector, is proposed.
Weakly-and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation
TLDR
Expectation-Maximization (EM) methods for semantic image segmentation model training under weakly supervised and semi-supervised settings are developed, and extensive experimental evaluation shows that the proposed techniques can learn models delivering competitive results on the challenging PASCAL VOC 2012 image segmentation benchmark, while requiring significantly less annotation effort.
Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing
TLDR
This paper proposes to train a semantic segmentation network starting from the discriminative regions and to progressively increase the pixel-level supervision by using seeded region growing, and obtains state-of-the-art performance.
Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features
TLDR
An iterative bottom-up and top-down framework is proposed which alternately expands object regions and optimizes the segmentation network, and which outperforms previous state-of-the-art methods by a large margin.
From image-level to pixel-level labeling with Convolutional Networks
TLDR
A Convolutional Neural Network-based model is proposed, which is constrained during training to put more weight on pixels which are important for classifying the image, and which beats the state of the art results in weakly supervised object segmentation task by a large margin.