Distribution Alignment: A Unified Framework for Long-tail Visual Recognition

  title={Distribution Alignment: A Unified Framework for Long-tail Visual Recognition},
  author={Songyang Zhang and Zeming Li and Shipeng Yan and Xuming He and Jian Sun},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  • Songyang Zhang, Zeming Li, Shipeng Yan, Xuming He, Jian Sun
  • Published 30 March 2021
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Despite the recent success of deep neural networks, it remains challenging to effectively model the long-tail class distribution in visual recognition tasks. To address this problem, we first investigate the performance bottleneck of the two-stage learning framework via ablative study. Motivated by our discovery, we propose a unified distribution alignment strategy for long-tail visual recognition. Specifically, we develop an adaptive calibration function that enables us to adjust the… 

A Simple Long-Tailed Recognition Baseline via Vision-Language Model

This work proposes BALLAD, a simple and effective approach that leverages contrastive vision-language models for long-tailed recognition, sets new state-of-the-art performance, and outperforms competitive baselines by a large margin.

Deep Long-Tailed Learning: A Survey

A comprehensive survey on recent advances in deep long-tailed learning is provided, highlighting important applications of deep long-tailed learning and identifying several promising directions for future research.

Long-Tail Instance Segmentation Based on Memory Bank and Confidence Calibration

An object-centric memory bank is used to establish an object-centric storage strategy that can solve the imbalance problem with respect to categories and improve segmentation accuracy.

Retrieval Augmented Classification for Long-Tail Visual Recognition

Retrieval Augmented Classification is introduced: a generic approach that augments standard image classification pipelines with an explicit retrieval module. Applied to the problem of long-tail classification, it achieves a high level of accuracy on tail classes.

Feature Re-Balancing for Long-Tailed Visual Recognition

A novel re-balancing framework, Feature Re-Balancing (FeatRB), is proposed. It directly re-balances the distribution in the feature space by combining the long-tailed initial features with generated virtual features, and it surpasses the current state-of-the-art methods.

Feature-Balanced Loss for Long-Tailed Visual Recognition

This paper addresses the long-tailed problem from feature space and proposes the feature-balanced loss, which encourages larger feature norms of tail classes by giving them relatively stronger stimuli.

A Survey on Long-Tailed Visual Recognition

This survey focuses on the problems caused by long-tailed data distribution, sorts out the representative long-tailed visual recognition datasets, summarizes some mainstream long-tail studies, and quantitatively studies 20 widely-used and large-scale visual datasets proposed in the last decade.

Inverse Image Frequency for Long-tailed Image Recognition

Inverse Image Frequency is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. It achieves stronger performance than similar works and is especially useful for downstream tasks such as long-tailed instance segmentation, as it produces fewer false positive detections.
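To make "multiplicative adjustment of the logits" concrete, here is a minimal sketch. The weight formula `log(total / count)`, the class counts, and all function names are assumptions chosen for illustration; the summary above does not specify the exact transformation used in the paper.

```python
import math

def inverse_frequency_weights(class_counts):
    """Per-class weight log(total / count): larger for rarer classes.

    This particular weighting is an illustrative assumption.
    """
    total = sum(class_counts)
    return [math.log(total / c) for c in class_counts]

def adjust_logits(logits, weights):
    """Scale each logit by its class weight (a multiplicative adjustment)."""
    return [z * w for z, w in zip(logits, weights)]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical counts for a head, a medium, and a tail class.
class_counts = [900, 90, 10]
logits = [2.0, 2.0, 2.0]  # the classifier is equally confident in all three

weights = inverse_frequency_weights(class_counts)
probs = softmax(adjust_logits(logits, weights))
```

With equal positive logits, the rescaling shifts probability mass toward the tail class. Note that a purely multiplicative factor amplifies negative logits as well, so a practical variant would have to handle logit sign deliberately.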

Learning with Free Object Segments for Long-Tailed Instance Segmentation

This paper investigates the similarity among object-centric images of the same class to propose candidate segments of foreground instances, followed by a novel ranking of segment quality, and proposes a simple and scalable framework, FreeSeg, for extracting and leveraging these “free” object segments to facilitate model training.

An EM Framework for Online Incremental Learning of Semantic Segmentation

A unified learning strategy based on the Expectation-Maximization (EM) framework is proposed, which integrates an iterative relabeling strategy that fills in the missing labels and a rehearsal-based incremental learning step that balances the stability-plasticity of the model.



BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition

A unified Bilateral-Branch Network (BBN) is proposed to take care of both representation learning and classifier learning simultaneously, where each branch performs its own duty separately.

Decoupling Representation and Classifier for Long-Tailed Recognition

It is shown that it is possible to outperform carefully designed losses, sampling strategies, even complex modules with memory, by using a straightforward approach that decouples representation and classification.

Learning to Segment the Tail

This work proposes a “divide&conquer” strategy for the challenging LVIS task: divide the whole data into balanced parts and then apply incremental learning to conquer each one. This derives a novel learning paradigm, class-incremental few-shot learning, which is especially effective for the challenge evolving over time.

Equalization Loss for Long-Tailed Object Recognition

This work proposes a simple but effective loss, named equalization loss, to tackle the problem of long-tailed rare categories by simply ignoring those gradients for rare categories, and wins 1st place in the LVIS Challenge 2019.
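One way to read "ignoring those gradients for rare categories" is a per-class sigmoid cross-entropy in which the negative (non-target) term is dropped for rare classes, so samples of other classes do not push rare-class scores down. The sketch below follows that reading; the function name, the frequency values, and the `rare_threshold` parameter are hypothetical, and the published loss may differ in detail.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def masked_bce_loss(logits, target, class_freqs, rare_threshold):
    """Sigmoid cross-entropy that skips the negative term for rare classes.

    A class counts as rare when its frequency is below rare_threshold
    (an assumed criterion). The target class always contributes.
    """
    loss = 0.0
    for j, z in enumerate(logits):
        p = sigmoid(z)
        if j == target:
            loss += -math.log(p)        # positive term, always kept
        elif class_freqs[j] >= rare_threshold:
            loss += -math.log(1.0 - p)  # negative term for frequent classes
        # Rare non-target class: term dropped, so it receives no gradient.
    return loss

logits = [1.0, 0.5, 2.0]
class_freqs = [0.5, 0.3, 0.001]  # hypothetical; class 2 is rare

full_loss = masked_bce_loss(logits, 0, class_freqs, rare_threshold=0.0)
masked_loss = masked_bce_loss(logits, 0, class_freqs, rare_threshold=0.01)
```

Setting `rare_threshold=0.0` treats no class as rare and recovers the ordinary per-class sigmoid cross-entropy, which makes the effect of the mask easy to compare.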

The Devil is in Classification: A Simple Framework for Long-tail Instance Segmentation

This work systematically investigates the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset, and unveils that a major cause is the inaccurate classification of object proposals.

LVIS: A Dataset for Large Vocabulary Instance Segmentation

This work introduces LVIS (pronounced ‘el-vis’): a new dataset for Large Vocabulary Instance Segmentation, which has a long tail of categories with few training samples due to the Zipfian distribution of categories in natural images.

Sharing Representations for Long Tail Computer Vision Problems

Several embedding approaches are presented, in increasing levels of complexity, to show how to tackle the long tail problem, from rare classes to unseen classes in image classification (the so-called zero-shot setting).

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.

Overcoming Classifier Imbalance for Long-Tail Object Detection With Balanced Group Softmax

  • Yu Li, Tao Wang, ..., Jiashi Feng
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
This work provides the first systematic analysis of the underperformance of state-of-the-art models when facing long-tail distributions and proposes a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training.
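The group-wise idea can be illustrated with a short sketch: classes are bucketed by training-instance count, and softmax is applied within each bucket, so head-class logits never compete directly with tail-class logits. The bin edges, counts, and function names are hypothetical, and this simplified version omits any further components the actual BAGS module may include.

```python
import math

def group_classes(class_counts, bin_edges):
    """Bucket class indices by how many bin edges their count reaches."""
    groups = [[] for _ in range(len(bin_edges) + 1)]
    for idx, count in enumerate(class_counts):
        g = sum(1 for edge in bin_edges if count >= edge)
        groups[g].append(idx)
    return [g for g in groups if g]  # drop empty buckets

def group_softmax(logits, groups):
    """Apply softmax independently inside each group of class indices."""
    probs = [0.0] * len(logits)
    for group in groups:
        m = max(logits[i] for i in group)
        exps = {i: math.exp(logits[i] - m) for i in group}
        s = sum(exps.values())
        for i in group:
            probs[i] = exps[i] / s
    return probs

# Hypothetical per-class training-instance counts and bin edges.
class_counts = [5000, 3000, 120, 80, 7, 3]
bin_edges = [10, 100, 1000]

groups = group_classes(class_counts, bin_edges)
probs = group_softmax([4.0, 3.0, 1.0, 0.5, -1.0, -2.0], groups)
```

Because normalization is local to each group, the probabilities sum to one per group rather than over all classes; a complete system needs an additional rule for comparing predictions across groups.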

Exploring the Limits of Weakly Supervised Pretraining

This paper presents a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images and shows improvements on several image classification and object detection tasks, and reports the highest ImageNet-1k single-crop, top-1 accuracy to date.