• Corpus ID: 244954810

Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition

Zhenxin Wu, Qingliang Chen, Yifeng Liu, Yinqi Zhang, Chengkai Zhu, Yang Yu
Fine-grained Visual Classification (FGVC) aims to identify objects from subcategories. It is a very challenging task because of the subtle inter-class differences. Existing research applies large-scale convolutional neural networks or visual transformers as the feature extractor, which is extremely computationally expensive. In fact, real-world scenarios of fine-grained recognition often require a more lightweight mobile network that can be utilized offline. However, the fundamental mobile… 

Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches

This work proposes a novel framework for fine-grained visual classification with a progressive training strategy that effectively fuses features from different granularities, and a random jigsaw patch generator that encourages the network to learn features at specific granularities.

TransFG: A Transformer Architecture for Fine-grained Recognition

This work presents TransFG, an augmented transformer-based model, and demonstrates its value through experiments on five popular fine-grained benchmarks, where it achieves state-of-the-art performance.

Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition

This paper proposes a novel part learning approach by a multi-attention convolutional neural network (MA-CNN), where part generation and feature learning can reinforce each other, and shows the best performances on three challenging published fine-grained datasets.

Hierarchical Part Matching for Fine-Grained Visual Categorization

A powerful flowchart named Hierarchical Part Matching (HPM) is proposed to cope with fine-grained classification tasks; it achieves state-of-the-art classification accuracy on the Caltech-UCSD-Birds-200-2011 dataset by making full use of the ground-truth part annotations.

IU-Module: Intersection and Union Module for Fine-Grained Visual Classification

The proposed IU-Module imposes two straightforward operations, namely channel intersection (CI) and channel union (CU) operations, on the convolutional features and achieves competitive results compared with the state-of-the-art methods.

Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition

A novel recurrent attention convolutional neural network (RA-CNN) is proposed that recursively learns discriminative region attention and region-based feature representation at multiple scales in a mutually reinforced way, achieving the best performance on three fine-grained tasks.

Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Image Recognition

The proposed Mask-CNN model has the smallest number of parameters, lowest feature dimensionality, and highest recognition accuracy when compared with state-of-the-art fine-grained approaches.

Part-Stacked CNN for Fine-Grained Visual Categorization

A novel Part-Stacked CNN architecture is proposed that explicitly explains the fine-grained recognition process by modeling subtle differences from object parts, considered from the perspectives of classification accuracy, model interpretability, and efficiency.

Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition

A cross-layer bilinear pooling approach is proposed to capture the inter-layer part feature relations, which results in superior performance compared with other bilinear-pooling-based approaches.

Pairwise Confusion for Fine-Grained Visual Classification

This work addresses overfitting in end-to-end neural network training on FGVC tasks using a novel optimization procedure, called Pairwise Confusion (PC), which reduces overfitting by intentionally introducing confusion in the activations.