Soft-NMS — Improving Object Detection with One Line of Code
Soft-NMS is proposed, an algorithm that decays the detection scores of all other objects as a continuous function of their overlap with M, the detection box with the maximum score, and improves the state of the art in object detection from 39.8% to 40.9% with a single model.
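The score decay described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes a Gaussian decay, and the function names (`iou`, `soft_nms`) and the default values of `sigma` and `score_threshold` are chosen here for the example. M is the highest-scoring detection still under consideration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Return (box index, rescored value) pairs in selection order.

    sigma and score_threshold are illustrative defaults (assumptions).
    """
    remaining = [[i, float(s)] for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # M is the remaining detection with the highest current score.
        m_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        m_idx, m_score = remaining.pop(m_pos)
        kept.append((m_idx, m_score))
        # Decay every other score as a continuous function of its overlap with M.
        survivors = []
        for idx, score in remaining:
            overlap = iou(boxes[m_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_threshold:
                survivors.append([idx, score])
        remaining = survivors
    return kept
```

With hard NMS, a box that overlaps M above a fixed threshold is discarded outright; in this sketch it stays in the list with a reduced score, and is dropped only when that score falls below `score_threshold`.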
An Analysis of Scale Invariance in Object Detection - SNIP
  • Bharat Singh, L. Davis
  • Computer Science
    IEEE/CVF Conference on Computer Vision and…
  • 22 November 2017
A novel training scheme called Scale Normalization for Image Pyramids (SNIP) is presented which selectively back-propagates the gradients of object instances of different sizes as a function of the image scale.
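As a rough illustration of selective back-propagation, the sketch below flags which object instances would contribute gradients at a given image scale. It is an assumption-laden simplification, not the paper's code: the function name `snip_valid_mask`, the use of the square root of box area as the size measure, and the example ranges are all hypothetical.

```python
import math


def snip_valid_mask(boxes, image_scale, valid_range):
    """Flag which boxes fall inside the valid size range at this image scale.

    boxes: (x1, y1, x2, y2) tuples in original-image coordinates.
    image_scale: resize factor applied to the image for this pyramid level.
    valid_range: (low, high) bounds on object size after resizing (hypothetical).
    """
    low, high = valid_range
    mask = []
    for x1, y1, x2, y2 in boxes:
        # Object size after resizing, measured as the square root of box area.
        size = math.sqrt((x2 - x1) * (y2 - y1)) * image_scale
        mask.append(low <= size <= high)
    return mask


boxes = [(0, 0, 20, 20), (0, 0, 200, 200)]
# At a high-resolution level only the small object is in range.
print(snip_valid_mask(boxes, image_scale=2.0, valid_range=(0, 80)))
# At a low-resolution level only the large object is in range.
print(snip_valid_mask(boxes, image_scale=0.5, valid_range=(40, float("inf"))))
```

In a training loop, instances whose mask entry is false would be excluded from the loss at that scale, so no gradient flows from them.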
Training Neural Networks Without Gradients: A Scalable ADMM Approach
This paper explores an unconventional training method that uses alternating direction methods and Bregman iteration to train networks without gradient descent steps, and exhibits strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
SNIPER: Efficient Multi-Scale Training
SNIPER brings training of instance level recognition tasks like object detection closer to the protocol for image classification and suggests that the commonly accepted guideline that it is important to train on high resolution images for instance level visual recognition tasks might not be correct.
A Multi-stream Bi-directional Recurrent Neural Network for Fine-Grained Action Detection
This paper presents a multi-stream bi-directional recurrent neural network for fine-grained action detection that significantly outperforms state-of-the-art action detection methods on both datasets evaluated.
Temporal Context Network for Activity Localization in Videos
A Temporal Context Network (TCN) is proposed for precise temporal localization of human activities; it outperforms state-of-the-art methods on the ActivityNet dataset and the THUMOS14 dataset.
Deception Detection in Videos
It is shown that predictions of high-level micro-expressions can be used as features for deception prediction, and surprisingly, IDT features, which have been widely used for action recognition, are also very good at predicting deception in videos.
R-FCN-3000 at 30fps: Decoupling Detection and Classification
It is shown that the objectness learned by R-FCN-3000 generalizes to novel classes and that performance increases with the number of training object classes, supporting the hypothesis that it is possible to learn a universal objectness detector.
Soft Sampling for Robust Object Detection
The robustness of object detection in the presence of missing annotations is studied, and it is observed that after dropping 30% of the annotations, the performance of CNN-based object detectors like Faster-RCNN only drops by 5% on the PASCAL VOC dataset.
Selecting Relevant Web Trained Concepts for Automated Event Retrieval
This work proposes an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval, and demonstrates large improvements over other vision-based systems on the TRECVID MED 13 dataset.