Fast, Accurate Detection of 100,000 Object Classes on a Single Machine

@inproceedings{Dean2013FastAD,
  title={Fast, Accurate Detection of 100,000 Object Classes on a Single Machine},
  author={Thomas L. Dean and Mark A. Ruzon and Mark E. Segal and Jonathon Shlens and Sudheendra Vijayanarasimhan and Jay Yagnik},
  booktitle={2013 IEEE Conference on Computer Vision and Pattern Recognition},
  year={2013},
  pages={1814--1821}
}
Many object detection systems are constrained by the time required to convolve a target image with a bank of filters that code for different aspects of an object's appearance, such as the presence of component parts. We exploit locality-sensitive hashing to replace the dot-product kernel operator in the convolution with a fixed number of hash-table probes that effectively sample all of the filter responses in time independent of the size of the filter bank. To show the effectiveness of the… 
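
The hashing idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it swaps in random-hyperplane (sign) hashing as a stand-in locality-sensitive hash, and every name in it (`make_planes`, `band_key`, `build_index`, `probe`) is hypothetical. Each filter is stored in a fixed number of hash tables; a feature window is then scored by probing each table once and counting how often each filter collides with it, rather than by taking a dot product with every filter.

```python
import random
from collections import defaultdict

def make_planes(dim, num_tables, bits_per_table, seed=0):
    """Draw random hyperplanes: one group of bits_per_table planes per table."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(bits_per_table)]
            for _ in range(num_tables)]

def band_key(planes, vec):
    """Hash key for one table: the sign pattern of vec against each plane."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0
                 for plane in planes)

def build_index(filters, all_planes):
    """Store every filter id under its key in each table (done once, offline)."""
    tables = [defaultdict(list) for _ in all_planes]
    for fid, filt in enumerate(filters):
        for table, planes in zip(tables, all_planes):
            table[band_key(planes, filt)].append(fid)
    return tables

def probe(tables, all_planes, window):
    """One probe per table; votes approximate which filters respond strongly."""
    votes = defaultdict(int)
    for table, planes in zip(tables, all_planes):
        for fid in table.get(band_key(planes, window), ()):
            votes[fid] += 1
    return votes

# Toy data (assumed for illustration): 200 random 16-dimensional "filters".
rng = random.Random(1)
filters = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
all_planes = make_planes(dim=16, num_tables=8, bits_per_table=12)
tables = build_index(filters, all_planes)

# A window identical to filter 42 collides with it in every table.
votes = probe(tables, all_planes, filters[42])
print(votes[42], "of", len(tables), "tables voted for filter 42")
```

The number of probes per window equals the number of tables, whatever the size of the filter bank. In this toy version the work of tallying votes still grows with bucket occupancy, so it only approximates the size-independent behaviour the abstract describes.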

Fast Template Evaluation with Vector Quantization
TLDR
A method is described that achieves a substantial end-to-end speedup over the best current methods without loss of accuracy, combining approximation of scores by vector quantizing feature windows with a number of speedup techniques, including a cascade.
A coarse-to-fine approach for fast deformable object detection
TLDR
A multiple-resolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space are proposed, yielding a ten-fold speedup over the standard dynamic programming approach; the method is complementary to the cascade-of-parts approach of [9].
Scalable object detection by filter compression with regularized sparse coding
TLDR
A new method called Regularized Sparse Coding is developed to reconstruct filter functionality, that is, the ability of a filter to produce accurate classification scores; it reconstructs filters by minimizing score map error, whereas sparse coding reconstructs filters by minimizing appearance error.
30Hz Object Detection with DPM V5
TLDR
An implementation of the Deformable Parts Model that operates in a user-defined time frame, uses a variety of mechanisms to trade off speed against accuracy, and exploits a series of important speedup mechanisms.
You Only Look Once: Unified, Real-Time Object Detection
TLDR
Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Immediate, Scalable Object Category Detection
TLDR
A new image representation based on mid-level discriminative patches is designed to suit immediate object category detection and inverted file indexing, and a fast method for spatially reranking images based on their detections is demonstrated.
A scalable architecture for multi-class visual object detection
TLDR
This work proposes a digital accelerator architecture for a high-throughput, robust, scalable, and tunable visual object detection pipeline based on Histogram of Oriented Gradients (HOG) features and exposes design-time parameters that can take advantage of domain-specific knowledge while supporting tunability through run-time configurations.
Knowing a Good HOG Filter When You See It: Efficient Selection of Filters for Detection
TLDR
It is shown that one can learn a universal model of part “goodness” based on properties that can be computed from the filter itself, which will improve its detection performance on the PASCAL VOC data sets, while speeding up training by an order of magnitude.
Context Forest for Object Class Detection
TLDR
Context Forest is presented, a technique for predicting properties of the objects in an image based on its global appearance that is more accurate, faster and more memory-efficient than standard nearest-neighbour techniques.

References

SHOWING 1-10 OF 26 REFERENCES
Robust Real-time Object Detection
TLDR
A visual object detection framework that is capable of processing images extremely rapidly while achieving high detection rates is described, with the introduction of a new image representation called the “Integral Image” which allows the features used by the detector to be computed very quickly.
A coarse-to-fine approach for fast deformable object detection
TLDR
A multiple-resolution hierarchical part-based model and a corresponding coarse-to-fine inference procedure that recursively eliminates unpromising part placements from the search space are proposed, yielding a ten-fold speedup over the standard dynamic programming approach; the method is complementary to the cascade-of-parts approach of [9].
Training-Free, Generic Object Detection Using Locally Adaptive Regression Kernels
  • H. Seo, P. Milanfar
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2010
TLDR
The proposed method operates using a single example of an object of interest to find similar matches, does not require prior knowledge about the objects being sought, and does not require any preprocessing step or segmentation of a target image.
Multiple kernels for object detection
TLDR
This work uses multiple kernel learning of Varma and Ray (ICCV 2007) to learn an optimal combination of exponential χ2 kernels, each of which captures a different feature channel.
Segmentation as selective search for object recognition
TLDR
This work adapts segmentation as a selective search, reconsidering segmentation to generate many approximate locations rather than a few precise object delineations, because an object whose location is never generated cannot be recognised and because appearance and immediate nearby context are most effective for object recognition.
Robust Real-Time Face Detection
TLDR
A new image representation called the “Integral Image” is introduced, which allows the features used by the detector to be computed very quickly, along with a method for combining classifiers in a “cascade”, which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
Multi-component Models for Object Detection
TLDR
This paper proposes a multi-component approach for object detection that forms visual clusters from the data that are tight in appearance and configuration spaces and allows the transfer of finer-grained semantic information from the components, such as keypoint location and segmentation masks.
Learning a category independent object detection cascade
TLDR
This work focuses on the first layers of a category-independent object detection cascade, in which a large number of windows are sampled from an objectness prior and a filter is then discriminatively learned to reduce these candidate windows by an order of magnitude.
Sparselet Models for Efficient Multiclass Object Detection
TLDR
An intermediate representation for deformable part models is developed and it is shown that this representation has favorable performance characteristics for multi-class problems when the number of classes is high and is well suited to a parallel implementation.
Histograms of oriented gradients for human detection
  • N. Dalal, B. Triggs
  • Computer Science
    2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05)
  • 2005
TLDR
It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.