Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

@article{Ren2015FasterRT,
  title={Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks},
  author={Shaoqing Ren and Kaiming He and Ross B. Girshick and Jian Sun},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2015},
  volume={39},
  pages={1137--1149}
}
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. [...] A Region Proposal Network (RPN) is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection.
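A minimal sketch of what "predicts object bounds and objectness scores at each position" can look like in code. It is an illustration under stated assumptions, not the authors' implementation: the function names (`conv1x1`, `rpn_head`), the use of `k` candidate boxes per position, the sigmoid objectness, and the random weights are all hypothetical choices made for this example. The point it shows is that the same small linear map is applied at every spatial position, so the head works on a feature map of any size.

```python
import math
import random


def conv1x1(features, weights):
    """1x1 convolution: apply the same linear map at every position.

    features: H x W x C nested lists; weights: one length-C filter per output channel.
    """
    return [[[sum(x * w for x, w in zip(pixel, filt)) for filt in weights]
             for pixel in row] for row in features]


def rpn_head(features, w_obj, w_box):
    """Return per-position objectness scores (H x W x k) and boxes (H x W x k x 4)."""
    logits = conv1x1(features, w_obj)   # k objectness logits per position
    raw = conv1x1(features, w_box)      # 4k box values per position
    # Sigmoid maps each logit to an objectness score in (0, 1) (assumed form).
    scores = [[[1.0 / (1.0 + math.exp(-z)) for z in pixel] for pixel in row]
              for row in logits]
    # Group the 4k values at each position into k boxes of 4 numbers each.
    boxes = [[[pixel[4 * i:4 * i + 4] for i in range(len(pixel) // 4)]
              for pixel in row] for row in raw]
    return scores, boxes


# Hypothetical sizes: a 5 x 7 feature map with 16 channels, 3 boxes per position.
rng = random.Random(0)
H, W, C, K = 5, 7, 16, 3
features = [[[rng.gauss(0, 1) for _ in range(C)] for _ in range(W)] for _ in range(H)]
w_obj = [[rng.gauss(0, 0.1) for _ in range(C)] for _ in range(K)]
w_box = [[rng.gauss(0, 0.1) for _ in range(C)] for _ in range(4 * K)]

scores, boxes = rpn_head(features, w_obj, w_box)
print(len(scores), len(scores[0]), len(scores[0][0]))                   # H, W, k
print(len(boxes), len(boxes[0]), len(boxes[0][0]), len(boxes[0][0][0]))  # H, W, k, 4
```

Because no layer depends on the input's height or width, the same weights can be reused on a larger or smaller feature map without change, which is the sense in which such a head is fully convolutional.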

R-FCN: Object Detection via Region-based Fully Convolutional Networks

TLDR
This work presents region-based, fully convolutional networks for accurate and efficient object detection, and proposes position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection.

Revisiting Faster R-CNN: A Deeper Look at Region Proposal Network

TLDR
A novel object detection network is proposed to address problems in Faster R-CNN, including feature maps too coarse for accurate localization, fixed-window feature extraction in the RPN, and insensitivity to small-scale objects.

R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

TLDR
This paper introduces a Global Context Module that improves the classification score maps by adopting large, separable convolutional kernels, and a new pooling method that better extracts scores from the score maps by using row-wise or column-wise max pooling.

BackNet: An Enhanced Backbone Network for Accurate Detection of Objects with Large Scale Variations

TLDR
This work proposes a robust network, BackNet, which can be integrated as a backbone into any two-stage detector. Evaluating BackNet-Faster RCNN on the MS COCO dataset shows that the proposed method outperforms five contemporary methods.

Relief R-CNN: Utilizing Convolutional Features for Fast Object Detection

TLDR
It is suggested that the value discrepancies among features in deep convolutional feature maps contain plenty of useful spatial information, and a simple approach to extract the information for fast region proposal generation in testing is proposed.

Rich Features and Precise Localization with Region Proposal Network for Object Detection

TLDR
This paper designs a new strategy for generating region proposals and proposes a new localization method for object detection that achieves the best recall and object detection accuracy.

Multi-scale Region Proposal Network Trained by Multi-domain Learning for Visual Object Tracking

This paper presents a multi-scale region proposal network (RPN) for visual object tracking, inspired by Faster R-CNN and YOLO detectors which adopt an RPN to significantly speed up the detection time

Atrous Faster R-CNN for Small Scale Object Detection

  • Tongfan Guan, Hao Zhu
  • Computer Science
    2017 2nd International Conference on Multimedia and Image Processing (ICMIP)
  • 2017
TLDR
This paper proposes a unified deep neural network built upon the prominent Faster R-CNN framework that outperforms the state of the art, especially for small-scale objects, on the PASCAL object detection challenge dataset.

Real-Time Object Detection With Reduced Region Proposal Network via Multi-Feature Concatenation

TLDR
This paper analyzes a widely used two-stage architecture called Faster R-CNN to improve the inference time and achieve real-time object detection without compromising on accuracy, and proposes a reduced region proposal network (RRPN) with dilated convolution and concatenation of multi-scale features.

Weakly Supervised Region Proposal Network and Object Detection

TLDR
This paper proposes a weakly supervised region proposal network which is trained using only image-level annotations and achieves state-of-the-art performance for weakly supervised object detection (WSOD), with a performance gain of about 3% on average.

References

SHOWING 1-10 OF 45 REFERENCES

Fast R-CNN

  • Ross B. Girshick
  • Computer Science, Environmental Science
    2015 IEEE International Conference on Computer Vision (ICCV)
  • 2015
This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Fast R-CNN builds on previous work to efficiently classify object proposals using deep

R-CNN minus R

TLDR
This paper designs and evaluates a detector that uses a trivial region generation scheme, constant for each image, and results in an excellent and fast detector that does not need to process an image with any algorithm other than the CNN itself.

Scalable Object Detection Using Deep Neural Networks

TLDR
This work proposes a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest.

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.

Object Detection Networks on Convolutional Feature Maps

TLDR
It is shown by experiments that despite the effective ResNets and Faster R-CNN systems, the design of NoCs is an essential element for the 1st-place winning entries in ImageNet and MS COCO challenges 2015.

Convolutional feature masking for joint object and stuff segmentation

  • Jifeng Dai, Kaiming He, Jian Sun
  • Computer Science
    2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2015
TLDR
This paper proposes a joint method to handle objects and “stuff” (e.g., grass, sky, water) in the same framework and presents state-of-the-art results on the PASCAL VOC benchmark and the new PASCAL-CONTEXT benchmark.

Learning to Segment Object Candidates

TLDR
A new way to generate object proposals is proposed, introducing an approach based on a discriminative convolutional network that obtains substantially higher object recall using fewer proposals and is able to generalize to categories it has not seen during training.

DeePM: A Deep Part-Based Model for Object Detection and Semantic Part Localization

TLDR
This paper annotates semantic parts for all 20 object categories on the PASCAL VOC 2012 dataset, which provides information on object pose, occlusion, viewpoint, and functionality, and presents an end-to-end Object-Part R-CNN which learns an implicit feature representation for jointly mapping an image ROI to the object and part bounding boxes.

Instance-Aware Semantic Segmentation via Multi-task Network Cascades

  • Jifeng Dai, Kaiming He, Jian Sun
  • Computer Science
    2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
TLDR
This paper presents Multitask Network Cascades for instance-aware semantic segmentation, which consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects, and develops an algorithm for the nontrivial end-to-end training of this causal, cascaded structure.

Scalable, High-Quality Object Detection

TLDR
It is demonstrated that learning-based proposal methods can effectively match the performance of hand-engineered methods while allowing for very efficient runtime-quality trade-offs.