Contextual Priming and Feedback for Faster R-CNN

@inproceedings{Shrivastava2016ContextualPA,
  title={Contextual Priming and Feedback for Faster R-CNN},
  author={Abhinav Shrivastava and Abhinav Kumar Gupta},
  booktitle={European Conference on Computer Vision},
  year={2016}
}
The field of object detection has seen dramatic performance improvements in the last few years. [...] Specifically, we propose to: (a) augment Faster R-CNN with a semantic segmentation network; (b) use segmentation for top-down contextual priming; (c) use segmentation to provide top-down iterative feedback using two-stage training. Our results indicate that all three contributions improve the performance on object detection, semantic segmentation and region proposal generation.
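To make idea (b) concrete, here is a minimal sketch of one way segmentation output could prime a region classifier: the per-class segmentation scores are averaged inside a region of interest and appended to that region's appearance features. This is an illustration under assumptions, not the authors' implementation; the function names `roi_average` and `primed_roi_features`, the average-pooling choice, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple

ScoreMaps = List[List[List[float]]]  # [classes][height][width]
Box = Tuple[int, int, int, int]      # (y0, x0, y1, x1), end-exclusive


def roi_average(seg_scores: ScoreMaps, box: Box) -> List[float]:
    """Average each class score map inside the box (hypothetical pooling choice)."""
    y0, x0, y1, x1 = box
    area = (y1 - y0) * (x1 - x0)
    return [
        sum(cls_map[y][x] for y in range(y0, y1) for x in range(x0, x1)) / area
        for cls_map in seg_scores
    ]


def primed_roi_features(appearance: Sequence[float],
                        seg_scores: ScoreMaps,
                        box: Box) -> List[float]:
    """Append pooled segmentation context to a region's appearance features."""
    return list(appearance) + roi_average(seg_scores, box)


# Toy example: two segmentation classes over a 2x2 image.
seg = [
    [[1.0, 0.0], [1.0, 0.0]],  # class 0 is confident in the left column
    [[0.0, 1.0], [0.0, 1.0]],  # class 1 is confident in the right column
]
features = primed_roi_features([0.5, 0.25], seg, box=(0, 0, 2, 1))
print(features)  # [0.5, 0.25, 1.0, 0.0]
```

A downstream classifier would then see both what the region looks like and what the segmentation network believes is there.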

Top-Down Feedback for Crowd Counting Convolutional Neural Network

This work proposes top-down feedback to correct the initial prediction of a CNN for crowd counting, evaluates the model on all major crowd datasets, and shows the effectiveness of top-down feedback.

ASAN: Self-Attending and Semantic Activating Network towards Better Object Detection

The so-called self-attending and semantic activating network (ASAN) achieves better accuracy than two-stage methods and supports real-time processing.

Feedback Pyramid Attention Networks for Single Image Super-Resolution

A novel feedback connection structure is developed to enhance low-level feature expression with high-level information, and a pyramid non-local structure is introduced to model global contextual information at different scales and improve the discriminative representation of the network.

Improve object detection via a multi-feature and multi-task CNN model

An object detection system based on a standard Fast R-CNN object detection branch and a DeepLab semantic segmentation branch. It aggregates hierarchical features into finer feature maps to detect objects at multiple scales, and uses a novel overlap loss function for bounding box regression to improve localization.

Recent progresses on object detection: a brief review

A simple but comprehensive survey of the recent improvements in object detection in the era of deep learning, which includes some other progress like real-time object detectors and works borrowing the idea from RNN and GAN.

Single shot object detection with top-down refinement

This paper proposes a single shot object detector with top-down refinement, denoted as SSD-TDR, which not only runs at high speed but also detects multi-scale objects accurately, achieving competitive results in both speed and accuracy compared to other VGG16-based networks.

A Context Aware Deep Learning Architecture for Object Detection

This work proposes an architecture aimed at learning contextual relationships and improving the precision of existing CNN-based object detectors by implementing a fully convolutional architecture.

Single-Shot Refinement Neural Network for Object Detection

This paper proposes a novel single-shot detector, called RefineDet, that achieves better accuracy than two-stage methods while maintaining efficiency comparable to one-stage methods.

Learning efficient single stage pedestrian detection by squeeze-and-excitation network

A novel framework is proposed that performs pedestrian detection by considering not only local features but also global information, which is incorporated into the features to make them more discriminative for this task.
...

References


Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.

Unrolling Loopy Top-Down Semantic Feedback in Convolutional Deep Networks

A novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing, together with a way to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods.

Simultaneous Detection and Segmentation

This work builds on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN), introducing a novel architecture tailored for SDS, and uses category-specific, top-down figure-ground predictions to refine the bottom-up proposals.

Hypercolumns for object segmentation and fine-grained localization

This work defines the hypercolumn at a pixel as the vector of activations of all CNN units above that pixel and, using hypercolumns as pixel descriptors, shows results on three fine-grained localization tasks: simultaneous detection and segmentation, keypoint localization, and part labeling.
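The hypercolumn definition above can be illustrated with a small sketch: for one image pixel, look up the corresponding location in each layer's feature map and concatenate the channel activations. This is a simplified illustration, not the paper's code. It assumes a nearest-neighbour lookup for coarser layers, and the function name `hypercolumn` and the toy feature maps are hypothetical.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # [channels][height][width]


def hypercolumn(feature_maps: List[FeatureMap],
                y: int, x: int,
                height: int, width: int) -> List[float]:
    """Concatenate activations from every layer at the location above pixel (y, x)."""
    column: List[float] = []
    for fmap in feature_maps:
        fh, fw = len(fmap[0]), len(fmap[0][0])
        # Assumption: nearest-neighbour mapping from image grid to layer grid.
        fy = min(y * fh // height, fh - 1)
        fx = min(x * fw // width, fw - 1)
        column.extend(channel[fy][fx] for channel in fmap)
    return column


# Two toy layers for a 4x4 image: a 4x4 map with 1 channel, a 2x2 map with 2 channels.
shallow = [[[float(r * 4 + c) for c in range(4)] for r in range(4)]]
deep = [
    [[10.0, 11.0], [12.0, 13.0]],
    [[20.0, 21.0], [22.0, 23.0]],
]
print(hypercolumn([shallow, deep], y=3, x=1, height=4, width=4))  # [13.0, 12.0, 22.0]
```

The resulting vector mixes fine spatial detail from the shallow layer with more semantic activations from the deep layer, which is the property a per-pixel descriptor needs for localization tasks.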

ParseNet: Looking Wider to See Better

This work presents a technique for adding global context to deep convolutional networks for semantic segmentation, and achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over baselines.
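One common way to add global context to a convolutional feature map is to pool each channel over the whole image, broadcast the pooled value back to every location, and concatenate it with the local features. The sketch below shows that general pattern only. It omits any feature normalization, and the function name `add_global_context` and the toy input are hypothetical, so it should not be read as the paper's exact procedure.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # [channels][height][width]


def add_global_context(fmap: FeatureMap) -> FeatureMap:
    """Append one broadcast global-average channel per input channel."""
    height, width = len(fmap[0]), len(fmap[0][0])
    # Global average pooling: one scalar per channel.
    means = [sum(sum(row) for row in channel) / (height * width) for channel in fmap]
    # Broadcast each scalar back to the full spatial grid.
    context = [[[m] * width for _ in range(height)] for m in means]
    return fmap + context


local = [[[1.0, 3.0], [5.0, 7.0]]]  # one channel, 2x2
augmented = add_global_context(local)
print(augmented[1])  # [[4.0, 4.0], [4.0, 4.0]]
```

Every location now carries the same image-level summary alongside its own activations, at the cost of doubling the channel count.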

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks

The Inside-Outside Net (ION), an object detector that exploits information both inside and outside the region of interest, provides strong evidence that context and multi-scale representations improve small object detection.

Fully convolutional networks for semantic segmentation

The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model

An object detection system that relies on a multi-region deep convolutional neural network that also encodes semantic segmentation-aware features. The resulting representation aims at capturing a diverse set of discriminative appearance factors and exhibits localization sensitivity that is essential for accurate object localization.

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

This integrated framework for using Convolutional Networks for classification, localization and detection is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 and obtained very competitive results for the detection and classification tasks.
...