Face detection is a mature problem in computer vision. While diverse high-performing face detectors have been proposed in the past, we present two surprising new top performance results. First, we show that a properly trained vanilla DPM reaches top performance, improving over commercial and research systems. Second, we show that a detector based on rigid …
Motivation: In weakly supervised object detection where only the presence or absence of an object category as a binary label is available for training, the common practice is to model the object location with latent variables and jointly learn them with the object appearance model [1, 5]. An ideal weakly supervised learning method for object detection is …
Figure 1: An illustration of our learning model: In the top row, we show clusters of objects and object parts that are simultaneously learned with the detectors during training. Our method encourages highly probable windows to be similar to one another through the jointly learned clusters during training. The colored lines indicate similarity between windows …
In this paper we evaluate the quality of the activation layers of a convolutional neural network (CNN) for the generation of object proposals. We generate hypotheses in a sliding-window fashion over different activation layers and show that the final convolutional layers can find the object of interest with high recall but poor localization due to the …
This paper focuses on the automatic recognition of human events in static images. Popular techniques use knowledge of the human pose for inferring the action, and the most recent approaches tend to combine pose information with either knowledge of the scene or of the objects with which the human interacts. Our approach takes a step forward in this …
Learning the distribution of images in order to generate new samples is a challenging task due to the high dimensionality of the data and the highly non-linear relations that are involved. Nevertheless, some promising results have been reported in the literature recently, building on deep network architectures. In this work, we zoom in on a specific type …
We present a method that can dramatically accelerate object detection with part-based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore, accelerating detection …
Cascading techniques are commonly used to speed up the scan of an image for object detection. However, cascades of detectors are slow to train due to the high number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we …
This paper presents a framework for view-invariant action recognition in image sequences. Feature-based human detection becomes extremely challenging when the agent is being observed from different viewpoints. Besides, similar actions, such as walking and jogging, are hardly distinguishable by considering the human body as a whole. In this work, we have …