Cross-X Learning for Fine-Grained Visual Categorization

@article{Luo2019CrossXLF,
  title={Cross-X Learning for Fine-Grained Visual Categorization},
  author={Wei Luo and Xitong Yang and Xianjie Mo and Yuheng Lu and Larry S. Davis and Jusong Li and Jian Yang and Ser-Nam Lim},
  journal={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2019},
  pages={8241--8250}
}
Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation. Recent work tackles this problem in a weakly-supervised manner: object parts are first detected and the corresponding part-specific features are extracted for fine-grained classification. However, these methods typically treat the part-specific features of each image in isolation while neglecting their relationships between different images…
Alignment Enhancement Network for Fine-grained Visual Categorization
Fine-grained visual categorization (FGVC) aims to automatically recognize objects from different sub-ordinate categories. Despite attracting considerable attention from both academia and industry, it…
Progressive Co-Attention Network for Fine-grained Visual Classification
TLDR
The proposed progressive co-attention network can be trained end-to-end, requires only image-level label supervision, and achieves competitive results on three fine-grained visual classification benchmark datasets: CUB200-2011, Stanford Cars, and FGVC Aircraft.
Label-Smooth Learning for Fine-Grained Visual Categorization
TLDR
This paper proposes a label-smooth learning method that improves the model's applicability to large categories by maximizing its prediction diversity, and demonstrates comparable or state-of-the-art performance on five benchmark datasets.
Weakly Supervised Fine-Grained Image Recognition Based on Multi-Channel Attention and Object Localization
TLDR
A fine-grained image recognition method based on multi-channel attention and object localization, consisting of two parts: a multi-channel attention (MCA) module that learns different discriminative regions, and an attention object location (AOL) module that locates the object in the input image.
Fine-Grained Visual Classification via Simultaneously Learning of Multi-regional Multi-grained Features
TLDR
It is argued that mining multi-regional multi-grained features is precisely the key to this task, and a new loss function is introduced, termed top-down spatial attention loss (TDSA-Loss), which contains a multi-stage channel constrained module and a top-down spatial attention module.
A Saliency-based Weakly-supervised Network for Fine-Grained Image Categorization
  • Yawen Han, Fang Meng
  • Computer Science
  • 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)
  • 2020
TLDR
This paper proposes a saliency-based weakly-supervised network, a simple approach that can effectively preserve all the salient details of the original image. It applies a feature pyramid network structure to combine the saliency information with high-level features and uses KL-divergence for knowledge distillation.
Associating Multi-Scale Receptive Fields For Fine-Grained Recognition
TLDR
A novel cross-layer non-local (CNL) module that associates multi-scale receptive fields through two operations, building spatial dependencies among multi-level layers and learning more discriminative features.
Variational Transfer Learning for Fine-grained Few-shot Visual Recognition
TLDR
This paper models the distribution of intra-class variance on the base set via variational inference and transfers the learned distribution to the novel set to generate additional features, which are used together with the original ones to train a classifier.
Multi-Order Feature Statistical Model for Fine-Grained Visual Categorization
TLDR
A multi-order feature statistical method (MOFS) that learns fine-grained features characterizing multiple orders by deploying two sub-modules on top of existing backbone networks, simultaneously capturing multiple levels of discriminative patterns, including local, global, and co-related patterns.
Context-aware Attentional Pooling (CAP) for Fine-grained Visual Classification
TLDR
This work proposes a novel context-aware attentional pooling (CAP) that effectively captures subtle changes via sub-pixel gradients, and learns to attend informative integral regions and their importance in discriminating different subcategories without requiring the bounding-box and/or distinguishable part annotations.

References

SHOWING 1-10 OF 50 REFERENCES
Embedding Label Structures for Fine-Grained Feature Representation
TLDR
The proposed multitask learning framework significantly outperforms previous fine-grained feature representations for image retrieval at different levels of relevance. To model the multi-level relevance, label structures such as hierarchy or shared attributes are seamlessly embedded into the framework by generalizing the triplet loss.
Multiple Granularity Descriptors for Fine-Grained Categorization
TLDR
This work leverages the fact that a subordinate-level object already has other labels in its ontology tree to train a series of CNN-based classifiers, each specialized at one grain level, which outperforms state-of-the-art algorithms, including those requiring strong labels.
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition
TLDR
A novel attention-based convolutional neural network (CNN) that regulates multiple object parts among different input images, can be easily trained end-to-end, and is highly efficient, requiring only one training stage.
Hyper-class augmented and regularized deep learning for fine-grained image classification
TLDR
A systematic framework for learning a deep CNN is proposed that addresses the challenges from two new perspectives by identifying easily annotated hyper-classes inherent in the fine-grained data and acquiring a large number of hyper-class-labeled images from readily available external sources.
The application of two-level attention models in deep convolutional neural network for fine-grained image classification
TLDR
This paper proposes to apply visual attention to the fine-grained classification task using a deep neural network. It achieves the best accuracy under the weakest supervision condition and is competitive against other methods that rely on additional annotations.
Picking Deep Filter Responses for Fine-Grained Image Recognition
TLDR
This paper proposes an automatic fine-grained recognition approach that is free of any object/part annotation at both training and testing stages, and conditionally picks deep filter responses to encode into the final representation, which considers the importance of the filter responses themselves.
Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition
TLDR
This paper proposes a novel part learning approach by a multi-attention convolutional neural network (MA-CNN), where part generation and feature learning can reinforce each other, and shows the best performances on three challenging published fine-grained datasets.
Fully Convolutional Attention Localization Networks: Efficient Attention Localization for Fine-Grained Recognition
TLDR
It is shown that zooming in on the selected attention regions significantly improves the performance of fine-grained recognition, and the proposed approach is noticeably more computationally efficient during both training and testing because of its fully-convolutional architecture.
Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization
TLDR
The proposed method improves classification accuracy in two ways: more discriminative CNN feature representations are generated using a training set augmented by collecting a large number of part patches from weakly supervised web images, and more robust object classifiers are learned using a multi-instance learning algorithm jointly on the strong and weak datasets.
Learning to Navigate for Fine-grained Classification
TLDR
This work proposes a novel self-supervision mechanism to effectively localize informative regions without the need for bounding-box/part annotations, and designs a novel training paradigm that enables the Navigator to detect the most informative regions under guidance from the Teacher.