Recently, significant performance improvements in face detection have been made possible by deeply trained convolutional networks. In this report, a novel approach for training a state-of-the-art face detector is described. The key is to exploit the idea of hard negative mining and iteratively update the Faster R-CNN based face detector with the hard negatives…
In this paper, we propose an RGB-D indoor scene recognition method that has two main advantages over existing methods. First, by training object detectors using RGB-D images and recognizing their spatial interrelationships, we not only achieve better object localization accuracy than using RGB images alone, but also obtain details as to how the…
Deploying femtocell networks achieves great spatial reuse at the price of severe interference from concurrent transmissions. To mitigate the downlink interference from femtocell base stations (FBSs) to nearby macrocell users (MUEs), power control is employed to enable FBSs to dynamically reconfigure their power allocation based on information obtained from…
Spontaneous facial expression recognition is significantly more challenging than recognizing posed expressions. We focus on two issues that are still under-addressed in this area. First, due to their inherent subtlety, the geometric and appearance features of spontaneous expressions tend to overlap with each other, making it hard for classifiers to find effective…
Recognizing egocentric actions is of increasing interest to the computer vision community. Conceptually, an egocentric action is largely identifiable by the states of hands and objects. For example, “drinking soda” is essentially composed of two sequential states, where one first “takes up the soda can”, then “drinks from…
We present a new topic model, named supervised Mixed Membership Stochastic Block Model, to recognize scene categories. In contrast to previous topic-model-based scene recognition, its key advantage originates from the joint modeling of the latent topics of adjacent visual words to promote the visual coherency of the latent topics. To ensure that an image is…
It is desirable to allow packets with the same source and destination to take more than one possible path. This facility can be used to ease congestion and overcome node failures. In this paper, we design and implement a k-multipath routing algorithm that allows a given source node to send samples of data to a given sink node in a large-scale sensor network.
In this paper, we propose a facial expression classification method using metric learning-based k-nearest neighbor voting. To achieve accurate classification of a facial expression from a frontal face image, we first learn a distance metric structure from training data that characterizes the feature space pattern, then use this metric to retrieve nearest…
In this paper, we propose a novel kernel function for recognizing objects in RGB-D egocentric videos. In order to effectively exploit the varied object appearance in a video, we take a set-based recognition approach and represent the target object using the set of frames contained in the video. Our kernel function measures the similarity of two sets by the…