ImageNet Large Scale Visual Recognition Challenge
- Olga Russakovsky, Jia Deng, Li Fei-Fei
- Computer Science
- International Journal of Computer Vision
- 1 September 2014
This paper describes the creation of this benchmark dataset and the advances in object recognition it has made possible, and compares state-of-the-art computer vision accuracy with human accuracy.
SSD: Single Shot MultiBox Detector
- W. Liu, Dragomir Anguelov, A. Berg
- Computer Science
- European Conference on Computer Vision
- 8 December 2015
The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
DSSD : Deconvolutional Single Shot Detector
- Cheng-Yang Fu, W. Liu, A. Ranga, A. Tyagi, A. Berg
- Computer Science
- ArXiv
- 23 January 2017
This paper combines a state-of-the-art classifier with a fast detection framework and augments SSD+Residual-101 with deconvolution layers to introduce additional large-scale context in object detection and improve accuracy, especially for small objects.
Attribute and simile classifiers for face verification
- Neeraj Kumar, A. Berg, P. Belhumeur, S. Nayar
- Computer Science
- IEEE International Conference on Computer Vision
- 1 September 2009
This paper presents two novel methods for face verification using binary classifiers trained to recognize the presence or absence of describable aspects of visual appearance, and introduces a new data set of real-world images of public figures acquired from the internet.
Modeling Context in Referring Expressions
- Licheng Yu, Patrick Poirson, Shan Yang, A. Berg, Tamara L. Berg
- Computer Science
- European Conference on Computer Vision
- 31 July 2016
This work focuses on incorporating better measures of visual context into referring expression models and finds that visual comparison to other objects within an image helps improve performance significantly.
Classification using intersection kernel support vector machines is efficient
- Subhransu Maji, A. Berg, Jitendra Malik
- Computer Science
- IEEE Conference on Computer Vision and Pattern…
- 23 June 2008
It is shown that one can build histogram intersection kernel SVMs (IKSVMs) with runtime complexity of the classifier logarithmic in the number of support vectors as opposed to linear for the standard approach.
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition
- Hao Zhang, A. Berg, M. Maire, Jitendra Malik
- Computer Science
- Computer Vision and Pattern Recognition
- 17 June 2006
This work considers visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. It proposes a hybrid of nearest-neighbor classification and support vector machines that deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice.
ParseNet: Looking Wider to See Better
- Wei Liu, Andrew Rabinovich, A. Berg
- Computer Science
- ArXiv
- 15 June 2015
This work presents a technique for adding global context to deep convolutional networks for semantic segmentation, and achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over baselines.
Recognizing action at a distance
- Alexei A. Efros, A. Berg, Greg Mori, Jitendra Malik
- Computer Science
- Proceedings Ninth IEEE International Conference…
- 13 October 2003
This paper introduces a novel motion descriptor based on optical flow measurements in a spatiotemporal volume for each stabilized human figure, together with an associated similarity measure for use in a nearest-neighbor framework.
Max-margin additive classifiers for detection
- Subhransu Maji, A. Berg
- Computer Science
- IEEE International Conference on Computer Vision
- 1 September 2009
This paper presents a pair of fast training algorithms for piecewise-linear classifiers, which can approximate arbitrary additive models. The classifiers are trained in a max-margin framework and significantly outperform linear classifiers on a variety of vision datasets.