CS-R-FCN: Cross-Supervised Learning for Large-Scale Object Detection

@inproceedings{Guo2020CSRFCNCL,
  title={CS-R-FCN: Cross-Supervised Learning for Large-Scale Object Detection},
  author={Ye Guo and Yali Li and Shengjin Wang},
  booktitle={ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2020},
  pages={2553--2557}
}
  • Ye Guo, Yali Li, Shengjin Wang
  • Published 2020
  • Computer Science
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Generic object detection is one of the most fundamental problems in computer vision, yet it is difficult to provide bounding-box-level annotations for the thousands of categories that large-scale object detection involves. In this paper, we present a novel cross-supervised learning pipeline for large-scale object detection, denoted CS-R-FCN. First, we propose to utilize the data flow of image-level annotated images in the fully-supervised two-stage object detection framework, leading to…
Grounded Situation Recognition
TLDR
A Joint Situation Localizer is proposed and it is found that jointly predicting situations and groundings with end-to-end training handily outperforms independent training on the entire grounding metric suite with relative gains between 8% and 32%.
