Recursive Training for Zero-Shot Semantic Segmentation

@article{wang2021recursive,
  title={Recursive Training for Zero-Shot Semantic Segmentation},
  author={Ce Wang and Moshiur Rahman Farazi and Nick Barnes},
  journal={2021 International Joint Conference on Neural Networks (IJCNN)},
  year={2021},
}
  • Published 26 February 2021
  • Computer Science
General-purpose semantic segmentation relies on a backbone CNN to extract discriminative features that help classify each image pixel into a ‘seen’ object class (i.e., an object class available during training) or a background class. Zero-shot semantic segmentation is a challenging task that requires a computer vision model to identify image pixels belonging to an object class it has never seen before. Equipping a general-purpose semantic segmentation model to separate image… 

References

The Role of Context for Object Detection and Semantic Segmentation in the Wild
A novel deformable part-based model is proposed that exploits both local context around each candidate detection and global context at the level of the scene, significantly helping to detect objects at all scales.
Zero-Shot Semantic Segmentation
A novel architecture, ZS3Net, is presented, combining a deep visual segmentation model with an approach that generates visual representations from semantic word embeddings, addressing pixel classification tasks where both seen and unseen categories appear at test time (so-called "generalized" zero-shot classification).
Generative Moment Matching Networks
This work presents a method that generates an independent sample via a single feedforward pass through a multilayer perceptron, as in the recently proposed generative adversarial networks, using maximum mean discrepancy (MMD) to learn to generate codes that can then be decoded to produce samples.
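As an illustration (not code from the cited paper), a minimal NumPy sketch of the squared maximum mean discrepancy with a Gaussian kernel, the quantity moment matching networks use to compare generated and real samples; the bandwidth `sigma` and the sample sizes here are arbitrary choices:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise RBF kernel k(a, b) = exp(-||a - b||^2 / (2 sigma^2)).
    d2 = (x**2).sum(1)[:, None] + (y**2).sum(1)[None, :] - 2.0 * x @ y.T
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))       # near zero
diff = mmd2(rng.normal(size=(200, 2)), rng.normal(3.0, 1.0, (200, 2)))  # clearly larger
```

Training a generator then amounts to minimizing this statistic between generated and real batches.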
The Pascal Visual Object Classes Challenge: A Retrospective
A review of the Pascal Visual Object Classes challenge from 2008-2012 and an appraisal of the aspects of the challenge that worked well, and those that could be improved in future challenges.
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
This work addresses the task of semantic image segmentation with deep learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
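To make the atrous idea concrete, here is an illustrative 1-D dilated convolution in NumPy (a sketch, not DeepLab's implementation); ASPP applies several such rates in parallel on the same feature map and fuses the outputs:

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    # Dilated ("atrous") convolution: kernel taps are spaced `rate`
    # samples apart, enlarging the receptive field at no extra cost.
    k = len(w)
    span = (k - 1) * rate + 1                  # effective kernel extent
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(w[j] * x[i + j * rate] for j in range(k))
    return out

x = np.arange(10, dtype=float)
full = atrous_conv1d(x, [1.0, 1.0, 1.0], rate=1)   # ordinary convolution
wide = atrous_conv1d(x, [1.0, 1.0, 1.0], rate=2)   # same 3 taps, wider span
```

With `rate=1` this reduces to an ordinary convolution; larger rates see a wider context with the same number of weights.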
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
  • Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 801–818.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
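The identity-shortcut idea can be sketched in a few lines of NumPy (an illustration, not the paper's architecture): the block outputs x + F(x), so with zero weights it is an exact identity mapping, which is why added depth does not have to hurt optimization:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    # Identity-shortcut block: output is x + F(x) with F(x) = W2 relu(W1 x),
    # so the stacked layers only have to learn the residual correction.
    return x + w2 @ relu(w1 @ x)

x = np.array([1.0, -2.0, 3.0])
zero = np.zeros((3, 3))
y = residual_block(x, zero, zero)  # with zero weights, the block passes x through unchanged
```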
U-Net: Convolutional Networks for Biomedical Image Segmentation
It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
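A toy 1-D sketch (not the actual U-Net code) of the skip connection that defines the architecture: decoder features are upsampled and concatenated with the matching encoder features, with pooling and repetition standing in for the real convolutional layers:

```python
import numpy as np

def down(x):
    # 2x downsampling by average pooling (stand-in for conv + pooling).
    return x.reshape(-1, 2).mean(axis=1)

def up(x):
    # 2x nearest-neighbour upsampling (stand-in for a transposed conv).
    return np.repeat(x, 2)

def unet_level(x):
    # The defining skip connection: upsampled coarse features are
    # concatenated with the matching encoder features, restoring the
    # fine spatial detail that downsampling discarded.
    skip = x
    coarse = down(x)
    return np.concatenate([up(coarse), skip])

feat = np.array([1.0, 3.0, 5.0, 7.0])
merged = unet_level(feat)  # coarse context followed by full-resolution detail
```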
Large-Scale Machine Learning with Stochastic Gradient Descent
A more precise analysis uncovers qualitatively different tradeoffs between small-scale and large-scale learning problems.
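A minimal NumPy sketch of the algorithm under discussion, applied to noiseless least squares (the learning rate and epoch count are arbitrary illustrative choices):

```python
import numpy as np

def sgd(X, y, lr=0.05, epochs=50, seed=0):
    # Stochastic gradient descent on squared loss, one sample per update:
    # each step costs O(d) regardless of the dataset size, which is what
    # makes SGD attractive in the large-scale regime.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x.w - y)^2
            w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
true_w = np.array([2.0, -1.0, 0.5])
w = sgd(X, X @ true_w)  # on noiseless data the true weights are recovered
```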
Rethinking Pre-training and Self-training
Self-training works well in exactly the setup where pre-training does not (using ImageNet to help COCO), and on the PASCAL segmentation dataset, where pre-training does help significantly, self-training still improves upon the pre-trained model.
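To illustrate the self-training loop in miniature (with a nearest-centroid classifier standing in for the real model, an assumption made for this sketch): fit on the labeled set, pseudo-label the unlabeled pool, then refit on the union:

```python
import numpy as np

def fit_centroids(X, y):
    # Nearest-centroid "model": one mean vector per class label.
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict(centroids, X):
    # Assign each point to the class of its closest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def self_train(X_lab, y_lab, X_unlab, rounds=3):
    # Self-training loop: fit on labeled data, pseudo-label the unlabeled
    # pool, refit on the union, and repeat for a few rounds.
    X, y = X_lab, y_lab
    for _ in range(rounds):
        centroids = fit_centroids(X, y)
        pseudo = predict(centroids, X_unlab)
        X = np.concatenate([X_lab, X_unlab])
        y = np.concatenate([y_lab, pseudo])
    return fit_centroids(X, y)

rng = np.random.default_rng(0)
X_lab = np.concatenate([rng.normal(0, 0.5, (5, 2)), rng.normal(4, 0.5, (5, 2))])
y_lab = np.array([0] * 5 + [1] * 5)
X_unlab = np.concatenate([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
centroids = self_train(X_lab, y_lab, X_unlab)
```

Real self-training systems typically also filter pseudo-labels by confidence before refitting, which this toy loop omits.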