Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

  • Malik Boudiaf, Hoel Kervadec, Imtiaz Masud Ziko, Pablo Piantanida, Ismail Ben Ayed, José Dolz
  • Published 11 December 2020
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We show that the way inference is performed in few-shot segmentation tasks has a substantial effect on performance, an aspect often overlooked in the literature in favor of the meta-learning paradigm. We introduce a transductive inference for a given query image, leveraging the statistics of its unlabeled pixels, by optimizing a new loss containing three complementary terms: i) the cross-entropy on the labeled support pixels; ii) the Shannon entropy of the posteriors on the unlabeled query…
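
To make the first two loss terms named in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the list-of-lists input layout, and the `entropy_weight` parameter are assumptions made for illustration, and the third term is left out because the abstract is cut off before describing it.

```python
import math


def support_cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy over labeled support pixels.

    probs:  list of per-pixel class-probability lists (each sums to 1).
    labels: list of integer class indices, one per pixel.
    """
    total = 0.0
    for pixel_probs, label in zip(probs, labels):
        total -= math.log(pixel_probs[label] + eps)
    return total / len(labels)


def query_shannon_entropy(probs, eps=1e-12):
    """Mean Shannon entropy of the posteriors on unlabeled query pixels."""
    total = 0.0
    for pixel_probs in probs:
        total -= sum(p * math.log(p + eps) for p in pixel_probs)
    return total / len(probs)


def partial_objective(support_probs, support_labels, query_probs,
                      entropy_weight=1.0):
    """Sum of the two terms above.

    entropy_weight is a hypothetical weighting, not a value from the paper;
    the third term of the loss is deliberately omitted.
    """
    ce = support_cross_entropy(support_probs, support_labels)
    ent = query_shannon_entropy(query_probs)
    return ce + entropy_weight * ent


if __name__ == "__main__":
    # Toy two-class example (background = 0, foreground = 1).
    support_probs = [[0.9, 0.1], [0.2, 0.8]]
    support_labels = [0, 1]
    query_probs = [[0.5, 0.5], [0.95, 0.05]]
    print(partial_objective(support_probs, support_labels, query_probs))
```

The entropy term is smallest when every query pixel receives a confident prediction, which is how unlabeled query pixels can influence inference even though they contribute no labels.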

HMFS: Hybrid Masking for Few-Shot Segmentation

This work compensates for the loss of fine-grained spatial details in the feature masking (FM) technique by investigating and leveraging a complementary basic input masking method, which shows improved performance against the current state-of-the-art methods by visible margins across different benchmarks.

Deep Gaussian Processes for Few-Shot Segmentation

This work proposes a few-shot learner formulation based on Gaussian process (GP) regression that sets a new state of the art for 5-shot segmentation, with mIoU scores of 68.1 and 49.8 on PASCAL-5i and COCO-20i, respectively.

Cross-domain Few-shot Segmentation with Transductive Fine-tuning

This work proposes to transductively tune the base model on a set of query images under the few-shot setting, where the core idea is to implicitly guide the segmentation of the query image using support labels, and shows that this method could consistently and significantly improve the performance of prototypical FSS models in all cross-domain tasks.

One-Shot Synthesis of Images and Segmentation Masks

The OSMIS model is introduced, inspired by the recent architectural developments of single-image GANs, which enables the synthesis of segmentation masks that are precisely aligned to the generated images in the one-shot regime, and outperforms state-of-the-art single-image GAN models in image synthesis quality and diversity.

Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

A meta-memory bank is proposed to improve the generalization of the segmentation network by bridging the domain gap between source and target domains and a new contrastive learning strategy is adopted to explore the knowledge of different categories during the training stage.

Rethinking Unsupervised Domain Adaptation for Semantic Segmentation

This work rethinks UDA from a data-centric point of view, asks how many labeled samples are necessary for satisfactory hyper-parameters of existing UDA methods, and conducts experiments to answer these questions in popular scenarios.

Temporal Transductive Inference for Few-Shot Video Object Segmentation

A simple but effective temporal transductive inference approach is proposed that leverages temporal consistency in the unlabelled video frames during few-shot inference and outperforms state-of-the-art meta-learning approaches in terms of mean intersection over union on YouTube-VIS.

Hypercorrelation Squeeze for Few-Shot Segmentation

Hypercorrelation Squeeze Networks (HSNet) are proposed, which leverage multi-level feature correlation and efficient 4D convolutions to squeeze high-level semantic and low-level geometric cues of the hypercorrelation into precise segmentation masks in a coarse-to-fine manner.

CobNet: Cross Attention on Object and Background for Few-Shot Segmentation

This paper proposes CobNet, which uses background information extracted from the query images, without annotations of those images, to overcome the issue of limited utility in few-shot segmentation.

Feature-Proxy Transformer for Few-Shot Segmentation

This paper proposes a novel Feature-Proxy Transformer (FPTrans) method, in which the “proxy” is the vector representing a semantic class in the linear classification head, and shows that FPTrans achieves competitive FSS accuracy on par with state-of-the-art decoder-based methods.



Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?

It is shown that a simple baseline, learning a supervised or self-supervised representation on the meta-training set followed by training a linear classifier on top of this representation, outperforms state-of-the-art few-shot learning methods.

Transductive Information Maximization For Few-Shot Learning

This work introduces Transductive Information Maximization (TIM) for few-shot learning, and proposes a new alternating-direction solver for the mutual-information loss, which substantially speeds up transductive-inference convergence over gradient-based optimization, while yielding similar accuracy.

CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning

CANet is presented, a class-agnostic segmentation network that performs few-shot segmentation on new classes with only a few annotated images available, and introduces an attention mechanism to effectively fuse information from multiple support examples under the setting of k-shot learning.

AMP: Adaptive Masked Proxies for Few-Shot Segmentation

A novel adaptive masked proxies method that constructs the final segmentation layer weights from few labelled samples by utilizing multi-resolution average pooling on base embeddings masked with the label to act as a positive proxy for the new class, while fusing it with the previously learned class signatures.

CRNet: Cross-Reference Networks for Few-Shot Segmentation

This paper proposes a cross-reference network (CRNet) for few-shot segmentation, and develops a mask refinement module to recurrently refine the prediction of the foreground regions in the query image.

Cross Attention Network for Few-shot Classification

A novel Cross Attention Network is introduced to deal with the problem of unseen classes and a transductive inference algorithm is proposed to alleviate the low-data problem, which iteratively utilizes the unlabeled query set to augment the support set, thereby making the class features more representative.

A Closer Look at Few-shot Classification

The results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones, and a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.

Few-Shot Semantic Segmentation with Democratic Attention Networks

This paper introduces the democratized graph attention mechanism, which can activate more pixels on the object to establish a robust correspondence between support and query images, and proposes multi-scale guidance by designing a refinement fusion unit to fuse features from intermediate layers for the segmentation of the query image.

Feature Weighting and Boosting for Few-Shot Segmentation

Two contributions are made: improving the discriminativeness of features so that their activations are high on the foreground and low elsewhere, and boosting inference with an ensemble of experts guided by the gradient of the loss incurred when segmenting the support images at test time.

Prior Guided Feature Enrichment Network for Few-Shot Segmentation

PFENet consists of a training-free prior mask generation method that not only retains generalization power but also improves model performance, and a Feature Enrichment Module (FEM) that overcomes spatial inconsistency by adaptively enriching query features with support features and prior masks.