Latent Embeddings for Zero-Shot Classification

@article{Xian2016LatentEF,
  title={Latent Embeddings for Zero-Shot Classification},
  author={Yongqin Xian and Zeynep Akata and Gaurav Sharma and Quynh N. Nguyen and Matthias Hein and Bernt Schiele},
  journal={2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2016},
  pages={69-77}
}
We present a novel latent embedding model for learning a compatibility function between image and class embeddings in the context of zero-shot classification. The proposed method augments the state-of-the-art bilinear compatibility model by incorporating latent variables. Instead of learning a single bilinear map, it learns a collection of maps, with the choice of which map to use being a latent variable for the current image-class pair. We train the model with a ranking-based objective…
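The core construction lends itself to a short sketch. Below is a minimal NumPy illustration of a max-over-bilinear-maps compatibility with a hinge ranking loss, assuming image embeddings theta(x), class embeddings phi(y), and K maps W_1..W_K; all names and shapes are illustrative, not the authors' code.

```python
import numpy as np

def compatibility(theta_x, phi_y, Ws):
    """F(x, y) = max_i theta(x)^T W_i phi(y): the best-matching latent
    map decides the score for this image-class pair."""
    return max(theta_x @ W @ phi_y for W in Ws)

def ranking_loss(theta_x, phi_pos, phi_neg, Ws, margin=1.0):
    """Hinge ranking loss: the correct class should outscore a wrong
    class by at least `margin`."""
    return max(0.0, margin + compatibility(theta_x, phi_neg, Ws)
                    - compatibility(theta_x, phi_pos, Ws))

# Toy usage with 3 latent maps, 5-d image and 4-d class embeddings.
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(5, 4)) for _ in range(3)]
x, y_pos, y_neg = rng.normal(size=5), rng.normal(size=4), rng.normal(size=4)
print(ranking_loss(x, y_pos, y_neg, Ws))
```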
Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders
TLDR
This work proposes a model where a shared latent space of image features and class embeddings is learned by modality-specific aligned variational autoencoders, and aligns the distributions learned from images and from side information to construct latent features that contain the essential multi-modal information associated with unseen classes.
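The "aligns the distributions" step can be sketched with one common choice of penalty: the squared 2-Wasserstein distance between the two modalities' diagonal-Gaussian latent posteriors. The exact form is my assumption here, and the names are illustrative, not the paper's code.

```python
import numpy as np

def wasserstein2_diag_gauss(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between two diagonal Gaussians,
    used as a distribution-alignment penalty between the image-feature
    and class-embedding latent spaces (assumed form)."""
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

# Toy usage: pull the two modality-specific posteriors together.
mu_img, sig_img = np.array([0.5, -0.2]), np.array([1.0, 0.8])
mu_cls, sig_cls = np.array([0.4, -0.1]), np.array([0.9, 1.1])
print(wasserstein2_diag_gauss(mu_img, sig_img, mu_cls, sig_cls))
```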
Structure Aligning Discriminative Latent Embedding for Zero-Shot Learning
TLDR
Apart from remarkably reducing the so-called semantic gap, the discriminative property of the learned latent layer representations entails improved classification performance on both the standard zero-shot learning (ZSL) and the challenging generalized ZSL (GZSL) setups on three benchmark datasets.
A Simple Approach for Zero-Shot Learning based on Triplet Distribution Embeddings
TLDR
This work addresses the issue of expressivity in terms of modeling the intra-class variability for each class in Zero-Shot Learning by leveraging the use of distribution embeddings, which are modeled as Gaussian distributions.
Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning
TLDR
This work proposes a model where a shared latent space of image features and class embeddings is learned by aligned variational autoencoders, for the purpose of generating latent features to train a softmax classifier.
Generalized Zero-Shot Learning via Aligned Variational Autoencoders
TLDR
This work proposes a model where a shared latent space of image features and class embeddings is learned by aligned variational autoencoders, for the purpose of generating latent features to train a softmax classifier and establishes a new state of the art on generalized zero-shot learning.
Zero-Shot Learning via Joint Latent Similarity Embedding
TLDR
A joint discriminative learning framework based on dictionary learning is developed to jointly learn the parameters of the model for both domains, ultimately leading to a class-independent classifier that shows a 4.90% improvement over the state of the art in accuracy, averaged across four benchmark datasets.
Zero-shot Learning using Graph Regularized Latent Discriminative Cross-domain Triplets
TLDR
This work introduces a ZSL framework by leveraging the intuitive idea of cross-domain triplets based metric learning for learning such a space, and introduces a novel graph Laplacian based regularizer which aligns the graph structures of the visual and semantic spaces in the learned embedding space.
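The graph Laplacian regularizer itself is standard and easy to sketch. Assuming an affinity matrix W over samples and their embeddings stacked row-wise in Z (illustrative names, not the paper's code), the penalty tr(Z^T L Z) = ½ Σ_ij W_ij ||z_i − z_j||² pulls connected samples together:

```python
import numpy as np

def laplacian_regularizer(Z, W):
    """tr(Z^T L Z) with L = D - W; small when samples joined by a
    high-affinity edge have nearby embeddings."""
    L = np.diag(W.sum(axis=1)) - W
    return np.trace(Z.T @ L @ Z)

# Toy usage: 4 samples, 2-d embeddings, a simple chain graph.
Z = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(laplacian_regularizer(Z, W))
```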
Learning Aligned Cross-Modal Representation for Generalized Zero-Shot Classification
TLDR
A novel Vision-Semantic Alignment (VSA) method is proposed to strengthen the alignment of cross-modal latent features in the latent subspaces, guided by a learned classifier, for Generalized Zero-Shot Classification (GZSC).
Attributes2Classname: A Discriminative Model for Attribute-Based Unsupervised Zero-Shot Learning
TLDR
Contrary to the traditional zero-shot learning approaches that are built upon attribute presence, the proposed approach bypasses the laborious attribute-class relation annotations for unseen classes; hence, the training can be augmented without the need to collect additional image data.
Transductive Zero-Shot Learning With a Self-Training Dictionary Approach
TLDR
A bidirectional mapping-based semantic relationship modeling scheme that seeks for cross-modal knowledge transfer by simultaneously projecting the image features and label embeddings into a common latent space is proposed.

References

Showing 1-10 of 46 references
Zero-Shot Learning via Joint Latent Similarity Embedding
TLDR
A joint discriminative learning framework based on dictionary learning is developed to jointly learn the parameters of the model for both domains, ultimately leading to a class-independent classifier that shows a 4.90% improvement over the state of the art in accuracy, averaged across four benchmark datasets.
Zero-Shot Learning by Convex Combination of Semantic Embeddings
TLDR
A simple method is proposed for constructing an image embedding system from any existing image classifier and a semantic word embedding model that contains the $n$ class labels in its vocabulary; it outperforms state-of-the-art methods on the ImageNet zero-shot learning task.
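The convex-combination idea can be sketched directly (illustrative, not the authors' code): take the classifier's top-T seen-class probabilities, form a weighted average of those classes' word embeddings, and classify by the nearest unseen-class embedding.

```python
import numpy as np

def conse_embedding(probs, seen_vecs, top_t=5):
    """Convex combination of the top-T seen classes' semantic vectors,
    weighted by the classifier's (renormalized) probabilities."""
    top = np.argsort(probs)[::-1][:top_t]
    w = probs[top] / probs[top].sum()
    return w @ seen_vecs[top]

def predict_unseen(probs, seen_vecs, unseen_vecs, top_t=5):
    """Nearest unseen-class embedding (cosine similarity) to the
    combined vector."""
    f = conse_embedding(probs, seen_vecs, top_t)
    sims = (unseen_vecs @ f) / (np.linalg.norm(unseen_vecs, axis=1)
                                * np.linalg.norm(f))
    return int(np.argmax(sims))

# Toy usage: 4 seen classes, 2 unseen classes, 3-d word embeddings.
rng = np.random.default_rng(0)
probs = np.array([0.6, 0.3, 0.05, 0.05])
seen_vecs, unseen_vecs = rng.normal(size=(4, 3)), rng.normal(size=(2, 3))
print(predict_unseen(probs, seen_vecs, unseen_vecs, top_t=2))
```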
Label-Embedding for Image Classification
TLDR
This work proposes to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors, and introduces a function that measures the compatibility between an image and a label embedding.
Evaluation of output embeddings for fine-grained image classification
TLDR
This project shows that compelling classification performance can be achieved on fine-grained categories even without labeled training data, and establishes a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets.
Zero-Shot Learning Through Cross-Modal Transfer
TLDR
This work introduces a model that can recognize objects in images even if no training data is available for the object class, and uses novelty detection methods to differentiate unseen classes from seen classes.
DeViSE: A Deep Visual-Semantic Embedding Model
TLDR
This paper presents a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text and shows that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training.
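A minimal sketch of a DeViSE-style hinge rank loss, assuming a learned linear map M from visual features into the word-vector space; the per-example loss sums margin violations against contrastive label vectors (names are illustrative, not the paper's code):

```python
import numpy as np

def hinge_rank_loss(v, M, t_label, t_others, margin=0.1):
    """Sum of margin violations: the true label's word vector t_label
    should score higher against the mapped image M @ v than every
    contrastive label vector in t_others."""
    pred = M @ v
    pos = t_label @ pred
    return float(sum(max(0.0, margin - pos + t @ pred) for t in t_others))

# Toy usage: map 4-d visual features into a 3-d word-vector space.
rng = np.random.default_rng(0)
M = rng.normal(size=(3, 4))
v = rng.normal(size=4)
t_label, t_others = rng.normal(size=3), rng.normal(size=(5, 3))
print(hinge_rank_loss(v, M, t_label, t_others))
```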
Evaluating knowledge transfer and zero-shot learning in a large-scale setting
TLDR
An extensive evaluation of three popular approaches to KT on a recently proposed large-scale dataset, the ImageNet Large Scale Visual Recognition Competition 2010 dataset, finding that none of the KT methods can improve over one-vs-all classification, but they prove valuable for zero-shot learning, especially hierarchical and direct similarity-based KT.
Relative attributes
TLDR
This work proposes a generative model over the joint space of attribute ranking outputs, as well as a novel form of zero-shot learning in which the supervisor relates the unseen object category to previously seen objects via attributes (for example, ‘bears are furrier than giraffes’).
An embarrassingly simple approach to zero-shot learning
TLDR
This paper describes a zero-shot learning approach that can be implemented in just one line of code, yet it is able to outperform state of the art approaches on standard datasets.
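The "one line of code" the title alludes to is a closed-form ridge-style solve. A sketch under common notation (X: d×m features, Y: m×z one-hot labels, S: a×z attribute signatures, with regularizers gamma and lam); this is my rendering of the technique, not the authors' exact code:

```python
import numpy as np

def eszsl_train(X, Y, S, gamma=1.0, lam=1.0):
    """Closed-form minimizer of ||X^T V S - Y||_F^2 plus Frobenius
    regularizers; returns the d x a matrix V."""
    d, a = X.shape[0], S.shape[0]
    return np.linalg.solve(X @ X.T + gamma * np.eye(d), X @ Y @ S.T) \
           @ np.linalg.inv(S @ S.T + lam * np.eye(a))

def eszsl_predict(x, V, S_unseen):
    """Score each unseen-class signature (columns of S_unseen); argmax."""
    return int(np.argmax(x @ V @ S_unseen))

# Toy usage: d=6 features, m=20 samples, z=4 seen classes, a=5 attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))
Y = np.eye(4)[rng.integers(0, 4, size=20)]   # m x z one-hot labels
S = rng.normal(size=(5, 4))                  # a x z class signatures
V = eszsl_train(X, Y, S)
print(eszsl_predict(rng.normal(size=6), V, rng.normal(size=(5, 2))))
```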
Learning Hypergraph-regularized Attribute Predictors
TLDR
A novel attribute learning framework named Hypergraph-based Attribute Predictor, which is casted as a regularized hypergraph cut problem, in which a collection of attribute projections is jointly learnt from the feature space to a hypergraph embedding space aligned with the attributes.