Pyramid Scene Parsing Network
TLDR
This paper exploits global context information through different-region-based context aggregation, using the pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet), to produce good-quality results on the scene parsing task.
Hierarchical Saliency Detection
TLDR
This work tackles saliency detection from a scale point of view and proposes a multi-layer approach to analyze saliency cues, by finding saliency values optimally in a tree model.
Path Aggregation Network for Instance Segmentation
TLDR
Path Aggregation Network (PANet) aims to boost information flow in proposal-based instance segmentation frameworks by enhancing the entire feature hierarchy with accurate localization signals from lower layers through bottom-up path augmentation.
Abnormal Event Detection at 150 FPS in MATLAB
TLDR
An efficient sparse combination learning framework based on inherent redundancy of video structures achieves decent performance in the detection phase without compromising result quality and reaches high detection rates on benchmark datasets at a speed of 140-150 frames per second on average.
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
TLDR
An image cascade network (ICNet) that incorporates multi-resolution branches under proper label guidance to address the challenging task of real-time semantic segmentation is proposed and in-depth analysis of the framework is provided.
Context Encoding for Semantic Segmentation
TLDR
The proposed Context Encoding Module significantly improves semantic segmentation results with only marginal extra computation cost over FCN, and can improve the feature representation of relatively shallow networks for image classification on the CIFAR-10 dataset.
Hybrid Task Cascade for Instance Segmentation
TLDR
This work proposes a new framework, Hybrid Task Cascade (HTC), which differs in two important aspects: (1) instead of performing cascaded refinement on detection and segmentation separately, it interweaves them for joint multi-stage processing; (2) it adopts a fully convolutional branch to provide spatial context, which can help distinguish hard foreground from cluttered background.
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
TLDR
An adaptive geometric consistency loss is proposed to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively and achieves state-of-the-art results in all three tasks, performing better than previous unsupervised methods and comparably with supervised ones.
Spatial As Deep: Spatial CNN for Traffic Scene Understanding
TLDR
This paper proposes Spatial CNN (SCNN), which generalizes traditional deep layer-by-layer convolutions to slice-by-slice convolutions within feature maps, thus enabling message passing between pixels across rows and columns in a layer.
Libra R-CNN: Towards Balanced Learning for Object Detection
TLDR
Libra R-CNN is proposed, a simple but effective framework towards balanced learning for object detection that integrates three novel components: IoU-balanced sampling, balanced feature pyramid, and balanced L1 loss, which reduce imbalance at the sample, feature, and objective levels, respectively.