Learn More
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive(More)
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet <xref ref-type="bibr" rid="ref1">[1]</xref> and Fast R-CNN <xref ref-type="bibr" rid="ref2">[2]</xref> have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this(More)
In this paper, we study the salient object detection problem for images. We formulate this problem as a binary labeling task where we separate the salient object from the background. We propose a set of novel features, including multiscale contrast, center-surround histogram, and color spatial distribution, to describe a salient object locally, regionally,(More)
In this paper, we propose a novel explicit image filter called guided filter. Derived from a local linear model, the guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another different image. The guided filter can be used as an edge-preserving smoothing operator like the(More)
Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational(More)
We present a very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment. Unlike previous regression-based approaches, we directly learn a vectorial regression function to infer the whole facial shape (a set of facial landmarks) from the image and explicitly minimize the alignment errors over the training data. The inherent(More)
Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224<inline-formula><tex-math>$\times$ </tex-math><alternatives><inline-graphic xlink:type="simple" xlink:href="he-ieq1-2389824.gif"/></alternatives></inline-formula>224) input image. This requirement is &#x201C;artificial&#x201D; and may reduce the recognition accuracy for the(More)
Recent progresses in salient object detection have exploited the boundary prior, or background information, to assist other saliency cues such as contrast, achieving state-of-the-art results. However, their usage of boundary prior is very simple, fragile, and the integration with other cues is mostly heuristic. In this work, we present new methods to(More)