Deep Residual Learning for Image Recognition
TLDR
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TLDR
This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Identity Mappings in Deep Residual Networks
TLDR
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
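As a hedged illustration of the propagation claim above: with identity skips and no activation after the addition, each block computes x_{l+1} = x_l + F(x_l), so any deeper state equals a shallower state plus the sum of the residuals in between. The minimal Python sketch below checks this numerically on scalar states. The names `forward`, `unrolled`, and the toy residual functions are hypothetical, invented for this illustration and not taken from the paper.

```python
import math

def forward(x, residual_fns):
    """Run a stack of identity-skip blocks, x <- x + F(x), and return all states."""
    states = [x]
    for f in residual_fns:
        x = x + f(x)  # identity skip, no activation after the addition
        states.append(x)
    return states

# Toy residual branches (assumptions for illustration only).
residual_fns = [
    lambda v: 0.5 * v,
    lambda v: math.tanh(v),
    lambda v: -0.1 * v * v,
    lambda v: 0.25,
]

states = forward(1.0, residual_fns)

def unrolled(l, L):
    """Return x_l plus the sum of residuals F_i(x_i) for i = l .. L-1."""
    return states[l] + sum(residual_fns[i](states[i]) for i in range(l, L))

# The state at any deeper block L matches the unrolled sum from any shallower block l.
for l in range(len(residual_fns)):
    for L in range(l + 1, len(states)):
        assert math.isclose(states[L], unrolled(l, L))
```

Because x_L is x_l plus a sum, the derivative of x_L with respect to x_l contains an additive term of 1, which is the sense in which the backward signal also reaches shallower blocks directly.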
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TLDR
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
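A minimal sketch of the two ideas in this summary, assuming the standard PReLU definition f(y) = max(0, y) + a * min(0, y) and a zero-mean Gaussian weight initialization with standard deviation sqrt(2 / ((1 + a^2) * n)), where n is the number of inputs feeding each unit. The function names `prelu` and `rectifier_init_std` are hypothetical and chosen only for this illustration.

```python
import math

def prelu(y, a):
    """PReLU: identity for positive inputs, slope `a` for negative inputs."""
    return max(0.0, y) + a * min(0.0, y)

def rectifier_init_std(fan_in, a=0.0):
    """Std of zero-mean Gaussian weights that accounts for the rectifier slope `a`.

    fan_in is the number of inputs per unit (assumed meaning of n here).
    With a = 0 this reduces to sqrt(2 / fan_in), the plain ReLU case.
    """
    return math.sqrt(2.0 / ((1.0 + a * a) * fan_in))

# a = 0 recovers ReLU; a nonzero slope lets negative inputs pass through scaled.
relu_like = prelu(-2.0, 0.0)
prelu_out = prelu(-2.0, 0.25)
```

In a real network the slope `a` is a learned parameter, possibly one per channel; here it is passed in as a plain number to keep the sketch self-contained.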
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
TLDR
This work equips the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the fixed-size input requirement of existing CNNs, and develops a new network structure, called SPP-net, which can generate a fixed-length representation regardless of image size/scale.
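To make the fixed-length property concrete, here is a minimal pure-Python sketch of spatial pyramid max pooling on a single-channel feature map. The pyramid levels (1, 2, 4), the bin-boundary rule, and the name `spp_single_channel` are assumptions made for illustration, not the paper's exact configuration.

```python
import math

def spp_single_channel(fmap, levels=(1, 2, 4)):
    """Max-pool a 2D feature map into n x n bins for each pyramid level n.

    The output length is sum(n * n for n in levels), independent of the
    height and width of `fmap`.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor on the left edge, ceil on the right,
            # so every bin covers at least one row.
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                out.append(max(fmap[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out

# Two feature maps of different sizes yield vectors of the same length.
small = [[r * 7 + c for c in range(7)] for r in range(5)]    # 5 x 7
large = [[r * 9 + c for c in range(9)] for r in range(13)]   # 13 x 9
vec_small = spp_single_channel(small)
vec_large = spp_single_channel(large)
```

With levels (1, 2, 4) each channel contributes 1 + 4 + 16 = 21 values, so a layer with C channels would produce a vector of length 21 * C whatever the input resolution.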
Face Alignment at 3000 FPS via Regressing Local Binary Features
TLDR
This paper presents a highly efficient, very accurate regression approach for face alignment that achieves state-of-the-art results on the most challenging current benchmarks.
Joint Cascade Face Detection and Alignment
TLDR
The key idea is to combine face alignment with detection, observing that aligned face shapes provide better features for face classification, and to learn the two tasks jointly in the same cascade framework by exploiting recent advances in face alignment.
Instance-Sensitive Fully Convolutional Networks
TLDR
This paper develops FCNs capable of proposing instance-level segment candidates; the network has no high-dimensional layer tied to the mask resolution and instead exploits local image coherence to estimate instances.
Object Detection Networks on Convolutional Feature Maps
TLDR
Experiments show that, even with effective ResNets and Faster R-CNN systems, the design of Networks on Convolutional feature maps (NoCs) is an essential element of the 1st-place winning entries in the ImageNet and MS COCO challenges 2015.