• Corpus ID: 67749964

Adaptive Cross-Modal Few-Shot Learning

Chen Xing, Negar Rostamzadeh, Boris N. Oreshkin, Pedro H. O. Pinheiro
Metric-based meta-learning techniques have been applied successfully to few-shot classification problems. In this paper, we propose to leverage cross-modal information to enhance metric-based few-shot learning methods. Visual and semantic feature spaces have different structures by definition. For certain concepts, visual features might be richer and more discriminative than text ones, while for others the inverse might be true. Moreover, when the support from visual information is limited in… 
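
The abstract is cut off before it describes the mechanism, so the following is only a minimal sketch of one plausible way to combine the two modalities adaptively: a per-class convex combination of a visual prototype and a semantic (text) embedding. The function name `adaptive_prototype` and the `gate_logit` input are hypothetical and chosen for illustration; in a real model the gate would presumably be produced by a small learned network.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_prototype(visual_proto, semantic_embed, gate_logit):
    """Mix a visual prototype with a semantic embedding of the same class.

    gate_logit is a hypothetical per-class score (an assumption, not taken
    from the abstract). A large positive value trusts the visual prototype,
    a large negative value trusts the semantic embedding.
    """
    lam = sigmoid(gate_logit)
    return [lam * v + (1.0 - lam) * s
            for v, s in zip(visual_proto, semantic_embed)]


# Toy 2-D example: equal trust in both modalities gives the midpoint.
mixed = adaptive_prototype([1.0, 0.0], [0.0, 1.0], gate_logit=0.0)
# Strong trust in the visual side keeps the result close to the visual prototype.
mostly_visual = adaptive_prototype([1.0, 0.0], [0.0, 1.0], gate_logit=8.0)
```

Because the mixing weight is computed per class, a class with few or uninformative images can lean on its text embedding while a visually distinctive class can ignore it.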


Learning Class Prototypes Via Anisotropic Combination of Aligned Modalities for Few-Shot Learning

It is argued that a proper alignment method is important for improving the performance of cross-modal methods, as query data has only visual information in few-shot learning tasks.

Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models

This work demonstrates that one can indeed build a better visual dog classifier by reading about dogs and listening to them bark and proposes a simple cross-modal adaptation approach that learns from few-shot examples spanning different modalities.

Multimodal Few-Shot Object Detection with Meta-Learning Based Cross-Modal Prompting

It is shown that meta-learning and prompt-based learning, the most commonly used methods for few-shot learning and zero-shot transferring from pre-trained vision-language models to downstream tasks, are conceptually similar, and it is proposed to combine meta-learning with prompt-based learning for multimodal FSOD without tuning.

Shaping Visual Representations with Attributes for Few-Shot Learning

This work proposes attribute-shaped learning (ASL), which can normalize visual representations to predict attributes for query images and devise an attribute-visual attention module (AVAM), which utilizes attributes to generate more discriminative features.

KAN: Knowledge-Augmented Networks for Few-Shot Learning

  • Zeyang Zhu, Xin Lin
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
Knowledge-Augmented Networks (KAN) are proposed, which combine visual features with semantic information extracted from a knowledge graph to represent each class. The method is shown to be effective on standard few-shot learning tasks, and with the augmented semantic information from the knowledge graph, KAN is able to learn more disentangled representations.

Feature Transformation Network for Few-Shot Learning

An attention-based affinity matrix is introduced to transform the semantically enhanced embedding vectors of query samples by associating the support set, thereby guiding the network to learn a sample representation that embodies higher semantic information in the target area.

Few-shot Learning with Contextual Cueing for Object Recognition in Complex Scenes

This work proposes a Class-conditioned Context Attention Module (CCAM) that learns to weight the most important context elements while learning a particular concept and proposes a flexible gating mechanism to ground visual class representations in context semantics.

Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition

An attributes-guided attention module (AGAM) is devised to utilize human-annotated attributes and learn more discriminative features in few-shot recognition and can significantly improve simple metric-based approaches to achieve state-of-the-art performance on different datasets and settings.

Extended Few-Shot Learning: Exploiting Existing Resources for Novel Tasks

A masking module is proposed that adjusts the features of auxiliary data to be more similar to those of the target classes and performs better than naively modeling the support examples and transfer learning by 4.68 and 6.03 percentage points, respectively.



Learning Robust Visual-Semantic Embeddings

An end-to-end learning framework that is able to extract more robust multi-modal representations across domains and a novel technique of unsupervised-data adaptation inference is introduced to construct more comprehensive embeddings for both labeled and unlabeled data.

Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders

This work proposes a model where a shared latent space of image features and class embeddings is learned by modality-specific aligned variational autoencoders, and aligns the distributions learned from images and from side-information to construct latent features that contain the essential multi-modal information associated with unseen classes.

Learning Compositional Representations for Few-Shot Recognition

This work introduces a simple regularization technique that allows the learned representation to be decomposable into parts, and demonstrates the value of compositional representations on three datasets and shows that they require fewer examples to learn classifiers for novel categories.

Meta-Learning for Semi-Supervised Few-Shot Classification

This work proposes novel extensions of Prototypical Networks that are augmented with the ability to use unlabeled examples when producing prototypes, and confirms that these models can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would.
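
As an illustration of how unlabeled examples can contribute to prototypes, here is a minimal sketch that assumes a single soft k-means refinement step: each unlabeled embedding is softly assigned to the nearest class prototypes and then added to the class means with that weight. The summary above does not spell out the exact update, so the function `refine_prototypes` and its details are assumptions made for this example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def refine_prototypes(support, unlabeled):
    """One soft k-means step (an assumed variant, for illustration only).

    support:   dict mapping class label -> list of labeled embeddings
    unlabeled: list of embeddings without labels
    Returns a dict mapping class label -> refined prototype.
    """
    classes = sorted(support)
    # Initial prototypes are plain means of the labeled embeddings.
    protos = {c: [sum(col) / len(support[c]) for col in zip(*support[c])]
              for c in classes}
    # Accumulators start from the labeled sums and counts.
    sums = {c: [sum(col) for col in zip(*support[c])] for c in classes}
    counts = {c: float(len(support[c])) for c in classes}
    for u in unlabeled:
        logits = [-sq_dist(u, protos[c]) for c in classes]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(z - top) for z in logits]
        total = sum(weights)
        for c, w in zip(classes, weights):
            r = w / total  # soft assignment of u to class c
            sums[c] = [s + r * x for s, x in zip(sums[c], u)]
            counts[c] += r
    return {c: [s / counts[c] for s in sums[c]] for c in classes}


# Toy 2-D episode: one labeled example per class, one unlabeled point near "a".
support = {"a": [[0.0, 0.0]], "b": [[4.0, 4.0]]}
refined = refine_prototypes(support, unlabeled=[[1.0, 1.0]])
```

In this toy episode the unlabeled point is assigned almost entirely to class "a", so that prototype moves toward it while the prototype of "b" barely changes.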

Discriminative k-shot learning using probabilistic models

It is shown that even a simple probabilistic model achieves state-of-the-art on a standard k-shot learning dataset by a large margin and is able to accurately model uncertainty, leading to well calibrated classifiers, and is easily extensible and flexible, unlike many recent approaches to k-shot learning.

Learning to Compare: Relation Network for Few-Shot Learning

A conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only few examples from each, which is easily extended to zero-shot learning.

TADAM: Task dependent adaptive metric for improved few-shot learning

This work identifies that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms, and proposes and empirically tests a practical end-to-end optimization procedure based on auxiliary task co-training to learn a task-dependent metric space.
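
To make the idea of metric scaling concrete, here is a minimal sketch that assumes class probabilities are a softmax over negative distances to class prototypes, multiplied by a scale factor. The name `scaled_softmax` and the parameter `alpha` are chosen for this example and are not taken from the summary above.

```python
import math


def scaled_softmax(distances, alpha):
    """Class probabilities from distances to prototypes: softmax(-alpha * d)."""
    logits = [-alpha * d for d in distances]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Distances from one query embedding to three class prototypes.
distances = [1.0, 2.0, 4.0]
p_soft = scaled_softmax(distances, alpha=0.1)    # small scale: nearly flat
p_sharp = scaled_softmax(distances, alpha=10.0)  # large scale: nearly one-hot
```

The same distances give a nearly flat distribution for a small scale and a nearly one-hot distribution for a large one, which changes how strongly the loss gradient focuses on the nearest prototypes.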

Low-Shot Learning from Imaginary Data

This work builds on recent progress in meta-learning by combining a meta-learner with a "hallucinator" that produces additional training examples, and optimizing both models jointly, yielding state-of-the-art performance on the challenging ImageNet low-shot classification benchmark.

Matching Networks for One Shot Learning

This work employs ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories to learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.

Optimization as a Model for Few-Shot Learning