Scaling Object Detection by Transferring Classification Weights

@inproceedings{Kuen2019ScalingOD,
  title={Scaling Object Detection by Transferring Classification Weights},
  author={Jason Kuen and Federico Perazzi and Zhe L. Lin and Jianming Zhang and Yap-Peng Tan},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2019},
  pages={6043--6052}
}
Large-scale object detection datasets are constantly growing in both the number of classes and the annotation count. Yet the number of object-level categories annotated in detection datasets is an order of magnitude smaller than the number of image-level classification labels. State-of-the-art object detection models are trained in a supervised fashion, and this limits the number of object classes they can detect. In this paper, we propose a novel weight transfer network (WTN) to effectively… 

Class-agnostic Object Detection
TLDR
This work proposes class-agnostic object detection as a new problem that focuses on detecting objects irrespective of their object classes, and proposes a new adversarial learning framework that forces the model to exclude class-specific information from the features used for predictions.
Cross-Supervised Object Detection
TLDR
This work proposes a unified framework that combines a detection head trained from instance-level annotations and a recognition head learned from image-level annotations, together with a spatial correlation module that bridges the gap between detection and recognition.
Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity
TLDR
Object detection with mixed supervision is considered, which learns novel object categories from weak annotations with the help of full annotations of existing base object categories; mask prior and semantic similarity are further transferred to bridge the gap between novel and base categories.
Proper Reuse of Image Classification Features Improves Object Detection
TLDR
It is shown that an extreme form of knowledge preservation (freezing the classifier-initialized backbone) consistently improves many different detection models and leads to considerable resource savings.
Boosting Weakly Supervised Object Detection with Progressive Knowledge Transfer
TLDR
An effective knowledge transfer framework to boost the weakly supervised object detection accuracy with the help of an external fully-annotated source dataset, whose categories may not overlap with the target domain, is proposed.
Weak Novel Categories without Tears: A Survey on Weak-Shot Learning
Li Niu · Computer Science · ArXiv · 2021
TLDR
This paper discusses the existing weak-shot learning methodologies in different tasks, treats weak-shot learning as weakly supervised learning with auxiliary fully supervised categories, and summarizes the codes at https://github.com/bcmi/Awesome-Weak-Shot-Learning.
Instance-Specific Feature Propagation for Referring Segmentation
TLDR
This work proposes a novel framework that simultaneously detects the target-of-interest via feature propagation and generates a coarse-grained segmentation mask for the target instance indicated by a natural language expression.
Learning to Localize Actions from Moments
TLDR
This paper introduces a new transfer learning design, Action Herald Networks (AherNet), that learns action localization for a large set of action categories using only action moments from the categories of interest together with temporal annotations of untrimmed videos from a small set of action classes.
Understanding AdamW through Proximal Methods and Scale-Freeness
TLDR
This paper shows how to re-interpret AdamW as an approximation of a proximal gradient method, which takes advantage of the closed-form proximal mapping of the regularizer instead of only utilizing its gradient information as in Adam-ℓ2.
PreDet: Large-scale weakly supervised pre-training for detection
TLDR
This work proposes a new large-scale pre-training strategy for detection, where noisy class labels are available for all images but bounding boxes are not, and designs a task that forces bounding boxes with high overlap to have similar representations in different views of an image, compared to non-overlapping boxes.

References

LSDA: Large Scale Detection through Adaptation
TLDR
This paper proposes Large Scale Detection through Adaptation (LSDA), an algorithm which learns the difference between the two tasks and transfers this knowledge to classifiers for categories without bounding box annotated data, turning them into detectors.
Large Scale Semi-Supervised Object Detection Using Visual and Semantic Knowledge Transfer
TLDR
Strong evidence is found that visual similarity and semantic relatedness are complementary for the task, and when combined notably improve detection, achieving state-of-the-art detection performance in a semi-supervised setting.
R-FCN-3000 at 30fps: Decoupling Detection and Classification
TLDR
It is shown that the objectness learned by R-FCN-3000 generalizes to novel classes and that performance increases with the number of training object classes, supporting the hypothesis that it is possible to learn a universal objectness detector.
Zero-Shot Object Detection
TLDR
This paper introduces the problem of zero-shot object detection (ZSD), which aims to detect object classes not observed during training, and discusses the problems associated with selecting a background class, which motivate two background-aware approaches for learning robust detectors.
Zero-Annotation Object Detection with Web Knowledge Transfer
TLDR
This work proposes an object detection method that does not require any form of human annotation on target tasks, by exploiting freely available web images, and introduces a multi-instance multi-label domain adaption learning framework with two key innovations.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%.
You Only Look Once: Unified, Real-Time Object Detection
TLDR
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Weakly Supervised Object Localization with Progressive Domain Adaptation
TLDR
This paper addresses the problem of weakly supervised object localization where only image-level annotations are available for training by progressive domain adaptation with two main steps: classification adaptation and detection adaptation.
R-FCN: Object Detection via Region-based Fully Convolutional Networks
TLDR
This work presents region-based, fully convolutional networks for accurate and efficient object detection, and proposes position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection.
Feature Pyramid Networks for Object Detection
TLDR
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.