Saliency detection by multi-context deep learning
TLDR
This paper proposes a multi-context deep learning framework for salient object detection that uses deep Convolutional Neural Networks to model the saliency of objects in images, and investigates different pre-training strategies to provide a better initialization for training the deep neural networks.
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
TLDR
Extensive experiments demonstrate that the proposed stacked generative adversarial networks significantly outperform other state-of-the-art methods in generating photo-realistic images.
Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification
TLDR
This work presents a pipeline for learning deep feature representations from multiple domains with Convolutional Neural Networks (CNNs) and proposes a Domain Guided Dropout algorithm to improve the feature learning procedure.
Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification
TLDR
Both the orientation invariant feature embedding and the spatial-temporal regularization achieve considerable improvements on the vehicle re-identification problem.
Learning Spatial Regularization with Image-Level Supervisions for Multi-label Image Classification
TLDR
Analysis of the learned Spatial Regularization Network (SRN) model demonstrates that it effectively captures both semantic and spatial relations of labels to improve classification performance. The model significantly outperforms state-of-the-art methods and has strong generalization capability.
Learning Feature Pyramids for Human Pose Estimation
TLDR
This work designs Pyramid Residual Modules (PRMs) to enhance the scale invariance of DCNNs and provides a theoretical derivation that extends the current weight initialization scheme to multi-branch network structures.
Object Detection from Video Tubelets with Convolutional Neural Networks
TLDR
This work introduces a complete framework for the VID task based on still-image object detection and general object tracking. It also proposes a temporal convolution network that incorporates temporal information to regularize the detection results, and shows its effectiveness for the task.
3D Human Pose Estimation in the Wild by Adversarial Learning
TLDR
An adversarial learning framework is proposed that distills the 3D human pose structures learned from a fully annotated dataset to in-the-wild images with only 2D pose annotations. A geometric descriptor, which computes the pairwise relative locations and distances between body joints, is designed as a new information source for the discriminator.
Online Multi-object Tracking Using CNN-Based Single Object Tracker with Spatial-Temporal Attention Mechanism
TLDR
A CNN-based framework for online MOT is proposed that uses the strengths of single object trackers in adapting appearance models and searching for the target in the next frame, and introduces a spatial-temporal attention mechanism (STAM) to handle the drift caused by occlusion and interaction among targets.
Understanding pedestrian behaviors from stationary crowd groups
TLDR
A novel pedestrian behavior model is proposed that includes stationary crowd groups as a key component. Its effectiveness is demonstrated through multiple applications, including walking path prediction, destination prediction, personality classification, and abnormal event detection.