• Corpus ID: 19257023

Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts

@article{Zhu2017ImagineIF,
  title={Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts},
  author={Yizhe Zhu and Mohamed Elhoseiny and Bingchen Liu and A. Elgammal},
  journal={ArXiv},
  year={2017},
  volume={abs/1712.01381}
}
Most existing zero-shot learning methods consider the problem as a visual semantic embedding one. Given the demonstrated capability of Generative Adversarial Networks (GANs) to generate images, we instead leverage GANs to imagine unseen categories from text descriptions and hence recognize novel classes with no examples being seen. Specifically, we propose a simple yet effective generative model that takes as input noisy text descriptions about an unseen class (e.g., Wikipedia articles) and…
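The core recipe in the abstract, synthesizing ("imagining") visual features for an unseen class from its text description and then classifying with the synthesized features, can be sketched as below. This is a minimal illustration, not the paper's method: a fixed random projection stands in for the trained conditional generator, and all names (`generate_features`, `W_text`, the toy class labels) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

TEXT_DIM, NOISE_DIM, FEAT_DIM = 16, 8, 32

# Hypothetical fixed random projections standing in for a trained generator G(z, text).
W_text = rng.normal(size=(TEXT_DIM, FEAT_DIM))
W_noise = rng.normal(size=(NOISE_DIM, FEAT_DIM))

def generate_features(text_emb, n_samples):
    """Imagine n_samples visual feature vectors for a class from its text embedding."""
    z = rng.normal(size=(n_samples, NOISE_DIM))          # per-sample noise
    return np.tanh(text_emb @ W_text + z @ W_noise)      # (n_samples, FEAT_DIM)

# Two unseen classes, known only through (stand-in) text embeddings.
text_embs = {
    "zebra": rng.normal(size=TEXT_DIM),
    "okapi": rng.normal(size=TEXT_DIM),
}

# Synthesize pseudo-examples per class and fit a nearest-class-mean classifier on them.
centroids = {c: generate_features(e, 200).mean(axis=0) for c, e in text_embs.items()}

def classify(feat):
    """Assign a visual feature vector to the nearest synthesized class centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))
```

In the paper's actual setting the generator is adversarially trained so that synthesized features match the distribution of real CNN features, and a standard supervised classifier is then trained on them; the nearest-centroid step above is only the simplest possible stand-in for that final classifier.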

Citations

Inspirational Adversarial Image Generation
TLDR
This work proposes a simple strategy to inspire creators with new generations learned from a dataset of their choice, providing some control over the output through a simple optimization method that finds the latent parameters whose generation is closest to any input inspirational image.
Zero-Shot Learning: An Energy Based Approach
TLDR
This paper proposes an Energy-Based Zero-shot Learning model (EBZL), which adapts the traditional deep Boltzmann machine to a supervised setting without changing its nature as an undirected probabilistic graphical model; this helps preserve semantic integrity and circumvents the semantic-loss problem.
End-to-end Generative Zero-shot Learning via Few-shot Learning
TLDR
Z2FSL is introduced, an end-to-end generative ZSL framework that uses such an approach as a backbone and feeds its synthesized output to a Few-Shot Learning (FSL) algorithm, reducing, in effect, ZSL to FSL.
Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network
TLDR
This paper introduces accompanying hierarchical-nested adversarial objectives inside the network hierarchies, which regularize mid-level representations and assist generator training to capture the complex image statistics.
Progressive Ensemble Networks for Zero-Shot Recognition
  • Meng Ye, Yuhong Guo
  • Computer Science
    2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
A novel progressive ensemble network model with multiple projected label embeddings is proposed to address zero-shot image recognition; it can naturally bridge the domain shift problem in visual appearances and be extended to the generalized zero-shot learning scenario.
Multi-Head Self-Attention via Vision Transformer for Zero-Shot Learning
TLDR
This work proposes an attention-based model for the ZSL setting that learns attributes useful for unseen-class recognition, using an attention mechanism adapted from the Vision Transformer to capture discriminative attributes by splitting images into small patches.
Implicit and Explicit Attention for Zero-Shot Learning
TLDR
The implicit attention mechanism is formulated with a self-supervised image-rotation task, which focuses on the specific image features that aid in solving the task; the approach achieves the state-of-the-art harmonic mean on all three datasets.
Self-Training Ensemble Networks for Zero-Shot Image Recognition
TLDR
The proposed self-training ensemble network model can naturally bridge the domain shift problem in visual appearances and be extended to the generalized zero-shot learning scenario; the empirical results demonstrate its efficacy.
Synthesis of High-Quality Visible Faces from Polarimetric Thermal Faces using Generative Adversarial Networks
TLDR
A GAN-based multi-stream feature-level fusion technique is presented to synthesize high-quality visible images from polarimetric thermal images, and experiments demonstrate that the proposed method achieves state-of-the-art performance.
A Survey on Visual Transfer Learning using Knowledge Graphs
TLDR
A broad overview of knowledge graph embedding methods is provided, and several joint training objectives suitable for combining them with high-dimensional visual embeddings are described, to help researchers find meaningful evaluation benchmarks.
...

References

Showing 1-10 of 59 references
Predicting Deep Zero-Shot Convolutional Neural Networks Using Textual Descriptions
TLDR
A new model is presented that can classify unseen categories from their textual description; it takes advantage of the CNN architecture and learns features at different layers, rather than just learning an embedding space for both modalities as is common with existing approaches.
Zero-Shot Learning Through Cross-Modal Transfer
TLDR
This work introduces a model that can recognize objects in images even if no training data is available for the object class, and uses novelty detection methods to differentiate unseen classes from seen classes.
Improved Techniques for Training GANs
TLDR
This work focuses on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic, and presents ImageNet samples with unprecedented resolution and shows that the methods enable the model to learn recognizable features of ImageNet classes.
Synthesized Classifiers for Zero-Shot Learning
TLDR
This work introduces a set of "phantom" object classes whose coordinates live in both the semantic space and the model space and demonstrates superior accuracy of this approach over the state of the art on four benchmark datasets for zero-shot learning.
Less is More: Zero-Shot Learning from Online Textual Documents with Noise Suppression
TLDR
An l2,1-norm-based objective function is proposed that can simultaneously suppress the noisy signal in the text and learn a function to match text documents and visual features, and an optimization algorithm is developed to efficiently solve the resulting problem.
Learning Robust Visual-Semantic Embeddings
TLDR
An end-to-end learning framework is presented that extracts more robust multi-modal representations across domains, and a novel technique of unsupervised-data adaptation inference is introduced to construct more comprehensive embeddings for both labeled and unlabeled data.
Synthesizing Samples for Zero-shot Learning
TLDR
A novel approach is proposed that turns the ZSL problem into a conventional supervised learning problem by synthesizing samples for the unseen classes: the probability distribution of each unseen class is estimated using knowledge from the seen classes and the class attributes.
Context Encoders: Feature Learning by Inpainting
TLDR
It is found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures, and can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
Learning a Deep Embedding Model for Zero-Shot Learning
  • Li Zhang, T. Xiang, S. Gong
  • Computer Science
    2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
TLDR
This paper proposes to use the visual space as the embedding space instead of embedding into a semantic space or an intermediate space, and argues that in this space, the subsequent nearest neighbour search would suffer much less from the hubness problem and thus become more effective.
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
TLDR
An Attentional Generative Adversarial Network that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation and for the first time shows that the layered attentional GAN is able to automatically select the condition at the word level for generating different parts of the image.
...