Towards Recognizing Unseen Categories in Unseen Domains

@article{Mancini2020TowardsRU,
  title={Towards Recognizing Unseen Categories in Unseen Domains},
  author={Massimiliano Mancini and Zeynep Akata and Elisa Ricci and Barbara Caputo},
  journal={ArXiv},
  year={2020},
  volume={abs/2007.12256}
}
Current deep visual recognition systems suffer from severe performance degradation when they encounter new images from classes and scenarios unseen during training. Hence, the core challenge of Zero-Shot Learning (ZSL) is to cope with the semantic-shift, whereas the main challenge of Domain Adaptation and Domain Generalization (DG) is the domain-shift. While historically ZSL and DG tasks are tackled in isolation, this work develops with the ambitious goal of solving them jointly, i.e. by…

Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains

TLDR
A novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains and their class-specific semantic representations to a common latent space is proposed.

COCOA: Context-Conditional Adaptation for Recognizing Unseen Classes in Unseen Domains

TLDR
This work proposes a feature generative framework integrated with a COntext COnditional Adaptive (COCOA) Batch-Normalization layer to seamlessly integrate class-level semantic and domain-specific information and demonstrates promising performance over baselines and state-of-the-art methods.

Domain-Aware Continual Zero-Shot Learning

TLDR
This work introduces Domain-Aware Continual Zero-Shot Learning (DACZSL), the task of visually recognizing images of unseen categories in unseen domains sequentially, and proposes a novel Domain-Invariant CZSL Network (DIN), which outperforms state-of-the-art baseline models adapted to the DACZSL setting.

Towards Recognizing New Semantic Concepts in New Visual Domains

TLDR
This thesis argues that it is crucial to design deep architectures that can operate in previously unseen visual domains and recognize novel semantic concepts, and proposes an approach based on domain and semantic mixing of inputs and features, which is a first, promising step towards solving this problem.

On the Challenges of Open World Recognition Under Shifting Visual Domains

TLDR
This letter investigates whether OWR algorithms are effective under domain-shift, presenting the first benchmark setup for fairly assessing the performance of OWR algorithms with and without domain-shift, and shows how existing OWR algorithms suffer a severe performance degradation when train and test distributions differ.

TTT-UCDR: Test-time Training for Universal Cross-Domain Retrieval

TLDR
This work uses test-time training techniques for adapting to distribution shifts under Universal Cross-Domain Retrieval (UCDR) to bridge the domain gap, which leads to improvements on UCDR benchmarks and also improves model robustness under a challenging cross-dataset generalization setting.

INDIGO: Intrinsic Multimodality for Domain Generalization

TLDR
This work proposes IntriNsic multimodality for DomaIn GeneralizatiOn (INDIGO), a simple and elegant way of leveraging the intrinsic modality present in these pre-trained multimodal networks along with the visual modality to enhance generalization to unseen domains at test-time.

Domain Generalization in Vision: A Survey

TLDR
This survey provides a comprehensive literature review summarizing the developments in DG for computer vision over the past decade, thoroughly reviews existing methods, and presents a categorization based on their methodologies and motivations.

Mode-Guided Feature Augmentation for Domain Generalization

TLDR
This paper proposes a simple and efficient DG approach to augment source domain(s) by hypothesizing the existence of a favourable correlation between the source and target domain’s major modes of variation; by exploring those modes in the source domain, the authors can realize meaningful alterations to background, appearance, pose and texture of object classes.

DecAug: Out-of-Distribution Generalization via Decomposed Feature Representation and Semantic Augmentation

TLDR
This work proposes DecAug, a novel decomposed feature representation and semantic augmentation approach for OoD generalization that outperforms other state-of-the-art methods on various OoD datasets and is among the very few methods that can deal with different types of OoD generalization challenges.

References

Synthesized Classifiers for Zero-Shot Learning

TLDR
This work introduces a set of "phantom" object classes whose coordinates live in both the semantic space and the model space and demonstrates superior accuracy of this approach over the state of the art on four benchmark datasets for zero-shot learning.

Deeper, Broader and Artier Domain Generalization

TLDR
This paper builds upon the favorable domain shift-robust properties of deep learning methods, and develops a low-rank parameterized CNN model for end-to-end DG learning that outperforms existing DG alternatives.

Zero-Shot Learning via Semantic Similarity Embedding

In this paper we consider a version of the zero-shot learning problem where seen class source and target domain data are provided. The goal during test-time is to accurately predict the class label

Unsupervised Domain Adaptation for Zero-Shot Learning

TLDR
A novel ZSL method is proposed based on unsupervised domain adaptation which uses the target domain class labels' projections in the semantic space to regularise the learned target domain projection thus effectively overcoming the projection domain shift problem.

Episodic Training for Domain Generalization

TLDR
Using the Visual Decathlon benchmark, it is demonstrated that the episodic-DG training improves the performance of such a general purpose feature extractor by explicitly training a feature for robustness to novel problems, showing that DG training can benefit standard practice in computer vision.

Feature Generating Networks for Zero-Shot Learning

TLDR
A novel generative adversarial network (GAN) that synthesizes CNN features conditioned on class-level semantic information, offering a shortcut directly from a semantic descriptor of a class to a class-conditional feature distribution.

Zero-Shot Deep Domain Adaptation

TLDR
ZDDA is the first domain adaptation and sensor fusion method which requires no task-relevant target-domain data and the underlying principle is not particular to computer vision data, but should be extensible to other domains.

Transductive Multi-View Zero-Shot Learning

TLDR
A novel heterogeneous multi-view hypergraph label propagation method is formulated for zero-shot learning in the transductive embedding space that rectifies the projection shift between the auxiliary and target domains, exploits the complementarity of multiple semantic representations, and significantly outperforms existing methods for both zero-shot and N-shot recognition.

F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning

TLDR
A conditional generative model that combines the strength of VAE and GANs and in addition, via an unconditional discriminator, learns the marginal feature distribution of unlabeled images is developed.

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

TLDR
Semantic-Guided Matching Discrepancy (SGMD) is proposed, which first employs instance matching between the source (S) and target (T) domains and then measures the discrepancy as a weighted feature distance between matched instances; a limited balance constraint is also designed to achieve a more balanced classification output on known and unknown categories.
...