Corpus ID: 236469181

Normalization Matters in Weakly Supervised Object Localization

@article{Kim2021NormalizationMI,
  title={Normalization Matters in Weakly Supervised Object Localization},
  author={Jeesoo Kim and Junsuk Choe and Sangdoo Yun and Nojun Kwak},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.13221}
}
Weakly supervised object localization (WSOL) enables finding an object using a dataset without any localization information. By simply training a classification model using only image-level annotations, the feature map of the model can be utilized as a score map for localization. Although many WSOL methods propose novel strategies, there has been no de facto standard for how to normalize the class activation map (CAM). Consequently, many WSOL methods have failed to fully exploit…
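Since the paper's focus is how the CAM is normalized before thresholding, here is a minimal sketch of two normalization choices commonly seen in the WSOL literature (max and min-max normalization); the function names, dummy score map, and threshold below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def max_normalize(cam: np.ndarray) -> np.ndarray:
    """Scale a class activation map by its maximum value (assumes non-negative scores)."""
    return cam / (cam.max() + 1e-8)

def minmax_normalize(cam: np.ndarray) -> np.ndarray:
    """Shift and scale a class activation map into [0, 1]."""
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Illustrative use: normalize, then threshold to obtain a foreground mask.
cam = np.random.rand(14, 14).astype(np.float32)  # dummy 14x14 score map
mask = minmax_normalize(cam) >= 0.5              # threshold of 0.5 is arbitrary here
```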

References

Showing 1-10 of 34 references
Evaluating Weakly Supervised Object Localization Methods Right
It is argued that the WSOL task is ill-posed with only image-level labels, and a new evaluation protocol is proposed in which full supervision is limited to a small held-out set that does not overlap with the test set.
Attention-Based Dropout Layer for Weakly Supervised Object Localization
  • Junsuk Choe, Hyunjung Shim
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
An Attention-based Dropout Layer (ADL) is proposed that utilizes the self-attention mechanism to process the feature maps of the model, improving the accuracy of WSOL and achieving a new state-of-the-art localization accuracy on the CUB-200-2011 dataset.
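A rough sketch of the attention-based dropout idea summarized above, assuming a channel-averaged self-attention map, a drop mask obtained by thresholding it, and an importance map from a sigmoid; the drop rate and threshold values are illustrative, not the paper's settings.

```python
import torch

def attention_based_dropout(feat: torch.Tensor, drop_rate: float = 0.75,
                            gamma: float = 0.9) -> torch.Tensor:
    """Sketch of an ADL-style layer applied during training to features of shape (B, C, H, W)."""
    attention = feat.mean(dim=1, keepdim=True)                  # self-attention map (B, 1, H, W)
    max_val = attention.flatten(2).max(dim=2)[0].view(-1, 1, 1, 1)
    drop_mask = (attention < gamma * max_val).float()           # erase the most discriminative region
    importance = torch.sigmoid(attention)                       # highlight informative regions
    use_drop = torch.rand(1).item() < drop_rate                 # randomly pick one map per step
    return feat * (drop_mask if use_drop else importance)

feat = torch.randn(2, 512, 14, 14)
out = attention_based_dropout(feat)
```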
Self-produced Guidance for Weakly-supervised Object Localization
Proposes Self-produced Guidance (SPG) masks that separate the foreground, i.e., the object of interest, from the background, providing the classification network with spatial correlation information about pixels.
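One way such self-produced foreground/background guidance could be formed from a normalized attention map is sketched below, using two assumed thresholds (high scores as foreground, low scores as background, the rest left undefined); the values and names are illustrative assumptions, not the authors' configuration.

```python
import torch

def self_produced_mask(attention: torch.Tensor, fg_thresh: float = 0.7,
                       bg_thresh: float = 0.1) -> torch.Tensor:
    """Turn a normalized attention map (values in [0, 1]) into a guidance mask:
    1 = foreground, 0 = background, 255 = undefined (ignored during training)."""
    mask = torch.full_like(attention, 255.0)
    mask[attention >= fg_thresh] = 1.0
    mask[attention <= bg_thresh] = 0.0
    return mask
```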
Adversarial Complementary Learning for Weakly Supervised Object Localization
This work mathematically proves that class localization maps can be obtained by directly selecting the class-specific feature maps of the last convolutional layer, which paves a simple way to identify object regions, and presents a simple network architecture that includes two parallel classifiers for object localization.
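A minimal sketch of the complementary-learning idea behind the two parallel classifiers, assuming the regions scored highest by the first branch are erased from the features fed to the second branch; the threshold and tensor shapes are illustrative assumptions.

```python
import torch

def erase_discriminative_regions(feat: torch.Tensor, cam_a: torch.Tensor,
                                 erase_thresh: float = 0.6) -> torch.Tensor:
    """Zero out feature locations where the first branch's normalized localization map
    cam_a exceeds a threshold, so the second branch must find complementary regions.
    feat: (B, C, H, W), cam_a: (B, 1, H, W) with values in [0, 1]."""
    keep = (cam_a < erase_thresh).float()
    return feat * keep
```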
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features
Patches are cut and pasted among training images, with the ground-truth labels mixed proportionally to the area of the patches; CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification, as well as on the ImageNet weakly supervised localization task.
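A short sketch of the CutMix operation as described above: a random patch is cut from a shuffled copy of the batch, pasted in place, and the labels are mixed in proportion to the pasted area. The Beta(alpha, alpha) sampling follows the common formulation; exact details may differ from the authors' code.

```python
import numpy as np
import torch

def cutmix(images: torch.Tensor, labels: torch.Tensor, alpha: float = 1.0):
    """Cut a random patch from shuffled images, paste it into the batch, and mix the
    labels in proportion to the pasted area. images: (B, C, H, W), labels: (B,)."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0))
    H, W = images.shape[2:]
    cut_h, cut_w = int(H * np.sqrt(1 - lam)), int(W * np.sqrt(1 - lam))
    cy, cx = np.random.randint(H), np.random.randint(W)
    y1, y2 = np.clip(cy - cut_h // 2, 0, H), np.clip(cy + cut_h // 2, 0, H)
    x1, x2 = np.clip(cx - cut_w // 2, 0, W), np.clip(cx + cut_w // 2, 0, W)
    images[:, :, y1:y2, x1:x2] = images[perm, :, y1:y2, x1:x2]
    lam = 1 - (y2 - y1) * (x2 - x1) / (H * W)    # adjust for clipping at image borders
    return images, labels, labels[perm], lam     # loss = lam*CE(y) + (1-lam)*CE(y_perm)
```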
Focal Loss for Dense Object Detection
This paper addresses the extreme foreground-background class imbalance encountered when training dense detectors by reshaping the standard cross-entropy loss so that it down-weights the loss assigned to well-classified examples. The resulting Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector.
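The reshaped loss can be written as FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); a minimal binary-classification sketch follows, with alpha and gamma set to the commonly reported defaults.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits: torch.Tensor, targets: torch.Tensor,
               alpha: float = 0.25, gamma: float = 2.0) -> torch.Tensor:
    """Binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t).
    logits and targets share the same shape; targets contain 0s and 1s."""
    targets = targets.float()
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)              # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```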
Learning Deep Features for Discriminative Localization
In this work, we revisit the global average pooling layer proposed in [13], and shed light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability.
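The class activation mapping construction summarized above amounts to a weighted sum of the last convolutional feature maps using the target class's classifier weights, CAM_c(x, y) = sum_k w_{c,k} f_k(x, y); the sketch below assumes the features and fully connected weights are already extracted from such a network.

```python
import torch

def class_activation_map(features: torch.Tensor, fc_weight: torch.Tensor,
                         class_idx: int) -> torch.Tensor:
    """CAM_c(x, y) = sum_k w_{c,k} * f_k(x, y): the class-c classifier weights applied
    to the last conv feature maps that precede global average pooling.
    features: (C, H, W), fc_weight: (num_classes, C)."""
    weights = fc_weight[class_idx].view(-1, 1, 1)    # (C, 1, 1)
    return (weights * features).sum(dim=0)           # (H, W) score map
```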
You Only Look Once: Unified, Real-Time Object Detection
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Hide-and-Seek: Forcing a Network to be Meticulous for Weakly-Supervised Object and Action Localization
The key idea is to hide patches in a training image randomly, forcing the network to seek other relevant parts when the most discriminative part is hidden, which obtains superior performance compared to previous methods for weakly-supervised object localization on the ILSVRC dataset.
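A minimal sketch of the hide-and-seek augmentation described above, assuming a square grid of patches, each hidden with a fixed probability; the grid size, probability, and fill value are illustrative (the original work fills hidden patches with the dataset mean).

```python
import torch

def hide_patches(image: torch.Tensor, grid_size: int = 4, hide_prob: float = 0.5,
                 fill_value: float = 0.0) -> torch.Tensor:
    """Divide an image (C, H, W) into a grid of patches and hide each patch with
    probability hide_prob, forcing the network to rely on less discriminative parts."""
    C, H, W = image.shape
    ph, pw = H // grid_size, W // grid_size
    out = image.clone()
    for i in range(grid_size):
        for j in range(grid_size):
            if torch.rand(1).item() < hide_prob:
                out[:, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = fill_value
    return out
```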