• Publications
  • Influence
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual RecognitionExpand
SSD: Single Shot MultiBox Detector
TLDR
The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component. Expand
SCAPE: shape completion and animation of people
TLDR
The SCAPE method is capable of constructing a high-quality animated surface model of a moving person, with realistic muscle deformation, using just a single static scan and a marker motion capture sequence of the person. Expand
SCAPE: shape completion and animation of people
TLDR
The SCAPE method is capable of constructing a high-quality animated surface model of a moving person, with realistic muscle deformation, using just a single static scan and a marker motion capture sequence of the person. Expand
3D Bounding Box Estimation Using Deep Learning and Geometry
TLDR
Although conceptually simple, this method outperforms more complex and computationally expensive approaches that leverage semantic segmentation, instance level segmentation and flat ground priors and produces state of the art results for 3D viewpoint estimation on the Pascal 3D+ dataset. Expand
Training Deep Neural Networks on Noisy Labels with Bootstrapping
TLDR
A generic way to handle noisy and incomplete labeling by augmenting the prediction objective with a notion of consistency is proposed, which considers a prediction consistent if the same prediction is made given similar percepts, where the notion of similarity is between deep network features computed from the input data. Expand
Scalability in Perception for Autonomous Driving: Waymo Open Dataset
TLDR
This work introduces a new large scale, high quality, diverse dataset, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods. Expand
Scalable Object Detection Using Deep Neural Networks
TLDR
This work proposes a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest. Expand
Discriminative learning of Markov random fields for segmentation of 3D scan data
TLDR
This work addresses the problem of segmenting 3D scan data into objects or object classes by using a recently proposed maximum-margin framework to discriminatively train the model from a set of labeled scans and automatically learn the relative importance of the features for the segmentation task. Expand
Google Street View: Capturing the World at Street Level
TLDR
A team of Google researchers describes the technical challenges involved in capturing, processing, and serving street-level imagery on a global scale. Expand
...
1
2
3
4
5
...