Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation

@article{Saleh2016BuiltinFP,
  title={Built-in Foreground/Background Prior for Weakly-Supervised Semantic Segmentation},
  author={Fatemeh Sadat Saleh and Mohammad Sadegh Ali Akbarian and Mathieu Salzmann and Lars Petersson and Stephen Gould and Jos{\'e} Manuel {\'A}lvarez},
  journal={ArXiv},
  year={2016},
  volume={abs/1609.00446}
}
Pixel-level annotations are expensive and time-consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recent CNN-based methods propose to fine-tune pre-trained networks using image tags. Without additional information, this leads to poor localization accuracy. This problem, however, was alleviated by making use of objectness priors to generate foreground/background masks. Unfortunately, these priors either require…
Incorporating Network Built-in Priors in Weakly-Supervised Semantic Segmentation
TLDR
This work proposes a novel method to extract accurate masks from networks pre-trained for the task of object recognition, thus forgoing external objectness modules, and shows how foreground/background masks can be obtained from the activations of higher-level convolutional layers of a network.
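The TLDR above describes deriving foreground/background masks directly from the activations of higher-level convolutional layers of a recognition network. A minimal sketch of that general idea (a CAM-style heuristic, not the paper's exact procedure; the function name and threshold are illustrative):

```python
import numpy as np

def foreground_mask(activations, threshold=0.5):
    """Coarse foreground/background mask from high-level conv activations.

    activations: array of shape (C, H, W) from a late convolutional layer.
    Returns a boolean (H, W) mask where True marks foreground.
    """
    # Fuse channels: strongly activated locations tend to lie on objects.
    fused = activations.mean(axis=0)
    # Normalize to [0, 1] so the threshold is scale-independent.
    lo, hi = fused.min(), fused.max()
    fused = (fused - lo) / (hi - lo + 1e-8)
    return fused >= threshold
```

In practice the binary mask would be upsampled to image resolution and used as a localization prior during fine-tuning.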
Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation
TLDR
A novel pooling method is proposed, dubbed background-aware pooling (BAP), that focuses more on aggregating foreground features inside the bounding boxes using attention maps, and introduces a noise-aware loss (NAL) that makes the networks less susceptible to incorrect labels.
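Background-aware pooling, as summarized above, aggregates features inside a bounding box while down-weighting locations an attention map flags as background. A simplified numpy sketch of that weighting scheme (the signature and coordinate convention are assumptions, not the paper's API):

```python
import numpy as np

def background_aware_pool(features, box, attention):
    """Attention-weighted average of features inside a bounding box.

    features:  (C, H, W) feature map
    box:       (y0, y1, x0, x1) in feature-map coordinates
    attention: (H, W) foreground attention in [0, 1]
    Returns a (C,) pooled feature vector.
    """
    y0, y1, x0, x1 = box
    feats = features[:, y0:y1, x0:x1]   # (C, h, w) crop
    w = attention[y0:y1, x0:x1]         # (h, w) foreground weights
    w = w / (w.sum() + 1e-8)            # normalize to a distribution
    # Weighted sum over spatial locations = attention-weighted average.
    return (feats * w).reshape(features.shape[0], -1).sum(axis=1)
```

With uniform attention this reduces to ordinary average pooling over the box; lower attention on background pixels shifts the pooled vector toward foreground features.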
Bringing Background into the Foreground: Making All Classes Equal in Weakly-Supervised Video Semantic Segmentation
TLDR
This paper introduces an approach to weak supervision using only image tags by making use of classifier heatmaps, develops a two-stream deep architecture that jointly leverages appearance and motion, and designs a loss based on these heatmaps to train it.
Learning to Exploit the Prior Network Knowledge for Weakly Supervised Semantic Segmentation
TLDR
This paper introduces a novel weakly supervised semantic segmentation model which is able to learn from image labels and just image labels, and generates accurate class-specific segmentation masks from these regions, where neither external objectness nor saliency algorithms are required.
Coarse-to-Fine Semantic Segmentation From Image-Level Labels
TLDR
This paper proposes a novel recursive coarse-to-fine semantic segmentation framework based on only image-level category labels that can be easily extended to the foreground object segmentation task and achieves performance comparable to state-of-the-art supervised methods on the Internet object dataset.
Exploiting Saliency for Object Segmentation from Image Level Labels
TLDR
This paper proposes using a saliency model as additional information, thereby exploiting prior knowledge on object extent and image statistics, and shows how to combine both information sources in order to recover 80% of the fully supervised performance of pixel-wise semantic labelling.
PCAMs: Weakly Supervised Semantic Segmentation Using Point Supervision
TLDR
This paper presents a novel procedure for producing semantic segmentation from images given some point-level annotations, which incorporates point annotations into the training of a convolutional neural network to produce improved localization and class activation maps, and then trains a normally fully supervised CNN using pseudo-labels in place of ground-truth labels.
Learning to Segment With Image-Level Supervision
TLDR
This paper proposes a model that generates auxiliary labels for each image, while simultaneously forcing the output of the CNN to satisfy the mean-field constraints imposed by a conditional random field, and achieves the state-of-the-art for weakly supervised semantic image segmentation on VOC 2012 dataset.
STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation
TLDR
A simple to complex (STC) framework in which only image-level annotations are utilized to learn DCNNs for semantic segmentation, which demonstrates the superiority of the proposed STC framework compared with other state-of-the-art frameworks.

References

Showing 1-10 of 41 references
BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation
  • Jifeng Dai, Kaiming He, Jian Sun
  • Computer Science
    2015 IEEE International Conference on Computer Vision (ICCV)
  • 2015
TLDR
This paper proposes a method that achieves competitive accuracy but only requires easily obtained bounding box annotations, and yields state-of-the-art results on PASCAL VOC 2012 and PASCAL-CONTEXT.
Learning to segment with image-level annotations
STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation
TLDR
A simple to complex (STC) framework in which only image-level annotations are utilized to learn DCNNs for semantic segmentation, which demonstrates the superiority of the proposed STC framework compared with other state-of-the-art frameworks.
From image-level to pixel-level labeling with Convolutional Networks
TLDR
A Convolutional Neural Network-based model is proposed, which is constrained during training to put more weight on pixels which are important for classifying the image, and which beats the state of the art results in weakly supervised object segmentation task by a large margin.
Learning to segment under various forms of weak supervision
TLDR
This work proposes a unified approach that incorporates various forms of weak supervision - image-level tags, bounding boxes, and partial labels - to produce a pixel-wise labeling on the challenging SIFT Flow dataset.
What's the Point: Semantic Segmentation with Point Supervision
TLDR
This work takes a natural step from image-level annotation towards stronger supervision: it asks annotators to point to an object if one exists, and incorporates this point supervision along with a novel objectness potential in the training loss function of a CNN model.
Weakly- and Semi-Supervised Learning of a DCNN for Semantic Image Segmentation
TLDR
Expectation-Maximization (EM) methods for semantic image segmentation model training under weakly supervised and semi-supervised settings are developed, and extensive experimental evaluation shows that the proposed techniques can learn models delivering competitive results on the challenging PASCAL VOC 2012 image segmentation benchmark, while requiring significantly less annotation effort.
Fully Convolutional Multi-Class Multiple Instance Learning
TLDR
This work proposes a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network that exploits the further supervision given by images with multiple labels.
Weakly supervised semantic segmentation with a multi-image model
TLDR
A novel method for weakly supervised semantic segmentation using a multi-image model (MIM) - a graphical model for recovering the pixel labels of the training images - and introducing an "objectness" potential that helps separate objects from background classes.
Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
TLDR
This work proposes Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space of a CNN, and demonstrates the generality of this new learning framework.