Learn More
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully(More)
We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default(More)
We introduce the SCAPE method (Shape Completion and Animation for PEople)---a data-driven method for building a human shape model that spans variation in both subject shape and pose. The method is based on a representation that incorporates both articulated and non-rigid deformations. We learn a <i>pose deformation model</i> that derives the non-rigid(More)
We address the problem of segmenting 3D scan data into objects or object classes. Our segmentation framework is based on a subclass of Markov random fields (MRFs) which support efficient graph-cut inference. The MRF models incorporate a large set of diverse features and enforce the preference that adjacent scan points have the same classification label. We(More)
Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012). The winning model on the localization sub-task was a network that predicts a single bounding box and a confidence score for each object category in(More)
Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting. The performance depends critically on the amount of labeled examples, and in current practice the labels are assumed to be unambiguous and accurate. However, this assumption often(More)
We present an unsupervised algorithm for registering 3D surface scans of an object undergoing significant deformations. Our algorithm does not use markers, nor does it assume prior knowledge about object shape, the dynamics of its deformation, or scan alignment. The algorithm registers two meshes by optimizing a joint probabilistic model over all(More)
In recent years, there has been a lot of interest in the database community in mining time series data. Surprisingly, little work has been done on verifying which measures are most suitable for mining of a given class of data sets. Such work is of crucial importance, since it enables us to identify similarity measures which are useful in a given context and(More)
Current high-quality object detection approaches use the same scheme: salience-based object proposal methods followed by post-classification using deep convolutional features. This spurred recent research in improving object proposal methods [18, 32, 15, 11, 2]. However, domain agnostic proposal generation has the principal drawback that the proposals come(More)