Revisiting Few-Shot Learning from a Causal Perspective

@article{Lin2022RevisitingFL,
  title={Revisiting Few-Shot Learning from a Causal Perspective},
  author={Guoliang Lin and Hanjiang Lai},
  journal={ArXiv},
  year={2022},
  volume={abs/2209.13816}
}
Few-shot learning with the N-way K-shot scheme is an open challenge in machine learning. Many approaches have been proposed to tackle this problem, e.g., Matching Networks and CLIP-Adapter. Although these approaches have shown significant progress, the mechanism by which they succeed has not been well explored. In this paper, we interpret these few-shot learning methods via a causal mechanism. We show that the existing approaches can be viewed as specific forms of front-door… 
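For context on where the truncated abstract is heading: the front-door adjustment it refers to is a standard identity from causal inference (written here in Pearl's generic notation with a mediator M between X and Y, not in this paper's specific instantiation):

P(Y | do(X=x)) = \sum_m P(m | x) \sum_{x'} P(Y | x', m) P(x')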

References

Interventional Few-Shot Learning

It is revealed that the contribution of IFSL is orthogonal to existing fine-tuning and meta-learning based FSL methods, hence IFSL can improve all of them, achieving a new 1-/5-shot state-of-the-art on miniImageNet, tieredImageNet, and cross-domain CUB.
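IFSL frames the pre-trained knowledge as a confounder and intervenes on it; as a reminder, the generic backdoor adjustment it builds on, for a confounder D (the paper's specific stratification is not reproduced here), is:

P(Y | do(X=x)) = \sum_d P(Y | x, d) P(d)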

Generalizing from a Few Examples: A Survey on Few-Shot Learning

A thorough survey to fully understand Few-Shot Learning (FSL), which categorizes FSL methods from three perspectives: data, which uses prior knowledge to augment the supervised experience; model, which uses prior knowledge to reduce the size of the hypothesis space; and algorithm, which uses prior knowledge to alter the search for the best hypothesis in the given hypothesis space.

Generalized Few-Shot Object Detection without Forgetting

Extensive experiments show that Retentive R-CNN significantly outperforms state-of-the-art methods on overall performance across all settings, achieving competitive results on few-shot classes while not degrading base-class performance at all.

Matching Networks for One Shot Learning

This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.
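A minimal sketch of the matching rule this describes, assuming cosine-softmax attention over support embeddings (function names and shapes are illustrative, not from the paper's code):

```python
import numpy as np

def matching_predict(support_feats, support_labels, query_feat, n_way):
    """Attention-weighted vote over the support set.

    support_feats: (N*K, D) embeddings; support_labels: (N*K,) ints in [0, n_way);
    query_feat: (D,). All names and shapes are illustrative assumptions.
    """
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    q = query_feat / np.linalg.norm(query_feat)
    sims = s @ q                                # cosine similarities
    attn = np.exp(sims - sims.max())
    attn /= attn.sum()                          # softmax attention a(x_hat, x_i)
    probs = np.zeros(n_way)
    for a_i, y_i in zip(attn, support_labels):  # weighted vote over labels
        probs[y_i] += a_i
    return int(probs.argmax())
```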

Finding Task-Relevant Features for Few-Shot Learning by Category Traversal

A Category Traversal Module is introduced that can be inserted as a plug-and-play module into most metric-learning based few-shot learners, identifying task-relevant features based on both intra-class commonality and inter-class uniqueness in the feature space.

Representation Learning via Invariant Causal Mechanisms

A novel self-supervised objective, Representation Learning via Invariant Causal Mechanisms (ReLIC), is proposed that enforces invariant prediction of proxy targets across augmentations through an invariance regularizer which yields improved generalization guarantees.
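A rough sketch of what such an invariance regularizer can look like, assuming a symmetric KL penalty between proxy-target predictions under two augmentations (a common choice; ReLIC's exact objective is not reproduced here):

```python
import numpy as np

def invariance_penalty(logits_a, logits_b, eps=1e-8):
    """Symmetric KL between predictions for two augmentations of the same images.

    logits_a, logits_b: (B, C) classifier outputs; illustrative, not ReLIC's code.
    """
    def softmax(z):
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)
    p, q = softmax(logits_a), softmax(logits_b)
    kl_pq = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=1)
    kl_qp = (q * (np.log(q + eps) - np.log(p + eps))).sum(axis=1)
    return 0.5 * (kl_pq + kl_qp).mean()
```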

Deconfounded Image Captioning: A Causal Retrospect

This paper presents a novel perspective, Deconfounded Image Captioning (DIC), retrospects modern neural image captioners from this causal viewpoint, and proposes a DIC framework, DICv1.0, to alleviate the negative effects brought by dataset bias.

Learning Transferable Visual Models From Natural Language Supervision

It is demonstrated that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
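The pre-training task described above amounts to a symmetric contrastive loss over a batch of (image, text) pairs; a minimal NumPy sketch (temperature value and names are illustrative assumptions, not CLIP's actual implementation):

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric cross-entropy over cosine similarities of B (image, text) pairs."""
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature            # (B, B) similarity matrix
    targets = np.arange(len(img))                 # matching pairs on the diagonal

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[targets, targets].mean()

    return 0.5 * (xent(logits) + xent(logits.T))  # image->text and text->image
```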

Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation

This work proposes a dynamic prototype convolution network (DPCN) to fully capture intrinsic details for accurate few-shot semantic segmentation (FSS), and shows that DPCN yields superior performance under both 1-shot and 5-shot settings.
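As a caricature of the "dynamic prototype convolution" idea, one can treat a support prototype as a 1x1 kernel convolved over the query feature map (DPCN's actual dynamic kernels are richer; this is an assumption-laden sketch):

```python
import numpy as np

def masked_avg_prototype(support_feat, mask, eps=1e-8):
    """Masked average pooling: support_feat (C, H, W), binary mask (H, W) -> (C,)."""
    return (support_feat * mask).sum(axis=(1, 2)) / (mask.sum() + eps)

def prototype_as_kernel(query_feat, prototype):
    """Use the prototype (C,) as a 1x1 dynamic kernel on a (C, H, W) query map.

    Returns an (H, W) affinity map; shapes and naming are illustrative only.
    """
    return np.einsum('c,chw->hw', prototype, query_feat)
```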

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

This paper introduces a contrastive loss to ALign the image and text representations BEfore Fusing through cross-modal attention, which enables more grounded vision and language representation learning and proposes momentum distillation, a self-training method which learns from pseudo-targets produced by a momentum model.