DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
- Jeff Donahue, Yangqing Jia, Trevor Darrell
- Computer Science, International Conference on Machine Learning
- 5 October 2013
DeCAF, an open-source implementation of deep convolutional activation features, is released along with all associated network parameters to enable vision researchers to experiment with deep representations across a range of visual concept learning paradigms.
Compact Bilinear Pooling
- Yang Gao, Oscar Beijbom, Ning Zhang, Trevor Darrell
- Computer Science, Computer Vision and Pattern Recognition
- 19 November 2015
Two compact bilinear representations are proposed with the same discriminative power as the full bilinear representation but with only a few thousand dimensions; they allow back-propagation of classification errors, enabling end-to-end optimization of the visual recognition system.
Part-Based R-CNNs for Fine-Grained Category Detection
- Ning Zhang, Jeff Donahue, Ross B. Girshick, Trevor Darrell
- Computer Science, European Conference on Computer Vision
- 14 July 2014
This work proposes a model for fine-grained categorization that overcomes limitations by leveraging deep convolutional features computed on bottom-up region proposals; it learns whole-object and part detectors, enforces learned geometric constraints between them, and predicts a fine-grained category from a pose-normalized representation.
PANDA: Pose Aligned Networks for Deep Attribute Modeling
- Ning Zhang, Manohar Paluri, M. Ranzato, Trevor Darrell, Lubomir D. Bourdev
- Computer Science, IEEE Conference on Computer Vision and Pattern…
- 21 November 2013
A new method is proposed that combines part-based models and deep learning by training pose-normalized CNNs to infer human attributes from images of people under large variation in viewpoint, pose, appearance, articulation and occlusion.
Beyond frontal faces: Improving Person Recognition using multiple cues
- Ning Zhang, Manohar Paluri, Yaniv Taigman, R. Fergus, Lubomir D. Bourdev
- Computer Science, Computer Vision and Pattern Recognition
- 22 January 2015
The Pose Invariant PErson Recognition (PIPER) method is proposed, which accumulates the cues of poselet-level person recognizers trained by deep convolutional networks to discount pose variations, combined with a face recognizer and a global recognizer.
Visual Attention Model for Name Tagging in Multimodal Social Media
- Di Lu, Leonardo Neves, Vitor R. Carvalho, Ning Zhang, Heng Ji
- Computer Science, Annual Meeting of the Association for…
- 1 July 2018
This paper creates two new multimodal datasets and proposes a novel model architecture based on Visual Attention that not only provides deeper visual understanding of the model's decisions, but also significantly outperforms other state-of-the-art baseline methods for this task.
Multi-view to Novel View: Synthesizing Novel Views With Self-learned Confidence
- Shao-Hua Sun, Minyoung Huh, Yuan-Hong Liao, Ning Zhang, Joseph J. Lim
- Computer Science, European Conference on Computer Vision
- 8 September 2018
This paper proposes an end-to-end trainable framework that learns to exploit multiple viewpoints to synthesize a novel view without any 3D supervision, and introduces a self-learned confidence aggregation mechanism.
Do Convnets Learn Correspondence?
- Jonathan Long, Ning Zhang, Trevor Darrell
- Computer Science, NIPS
- 4 November 2014
Evidence is presented that convnet features localize at a much finer scale than their receptive field sizes, that they can be used to perform intraclass alignment as well as conventional hand-engineered features, and that they outperform conventional features in keypoint prediction on objects from PASCAL VOC 2011.
Birdlets: Subordinate categorization using volumetric primitives and pose-normalized appearance
- Ryan Farrell, Om Oza, Ning Zhang, Vlad I. Morariu, Trevor Darrell, L. Davis
- Computer Science, Vision
- 6 November 2011
An approach for subordinate categorization in vision is developed, focusing on an avian domain due to the fine-grained structure of the category taxonomy for this domain, and a pose-normalized appearance model based on a volumetric poselet scheme is explored.
Deformable Part Descriptors for Fine-Grained Recognition and Attribute Prediction
- Ning Zhang, Ryan Farrell, Forrest N. Iandola, Trevor Darrell
- Computer Science, IEEE International Conference on Computer Vision
- 1 December 2013
This paper proposes two pose-normalized descriptors built on computationally efficient deformable part models (DPMs) with strongly supervised parts, which enable pooling across pose and viewpoint, in turn facilitating tasks such as fine-grained recognition and attribute prediction.