FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
This work proposes a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, which avoids enumerating and training individual architectures separately, as previous methods do.
Building Generalizable Agents with a Realistic and Rich 3D Environment
This work builds House3D, a rich, extensible, and efficient environment containing 45,622 human-designed 3D scenes of houses, equipped with a diverse set of fully labeled 3D objects, textures, and scene layouts. It is based on the SUNCG dataset and emphasizes semantic-level generalization.
Single Image 3D Interpreter Network
This work proposes the 3D INterpreter Network (3D-INN), an end-to-end framework that sequentially estimates 2D keypoint heatmaps and 3D object structure. It is trained on both real 2D-annotated images and synthetic 3D data, and achieves state-of-the-art performance on both 2D keypoint estimation and 3D structure recovery.
Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees
A novel algorithmic framework for designing and analyzing model-based RL algorithms with theoretical guarantees is introduced, and a meta-algorithm with a theoretical guarantee of monotone improvement to a local maximum of the expected reward is designed.
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
We consider the problem of learning a one-hidden-layer neural network with a non-overlapping convolutional layer and ReLU activation, i.e., $f(\mathbf{Z}, \mathbf{w}, \mathbf{a}) = \sum_j \ldots$
Simple Baseline for Visual Question Answering
A very simple bag-of-words baseline for visual question answering that concatenates the word features from the question and the CNN features from the image to predict the answer.
Semantic Amodal Segmentation
A detailed image annotation that captures information beyond the visible pixels and requires complex reasoning about full scene structure is proposed, and it is shown that the proposed full scene annotation is surprisingly consistent between annotators, including for regions and edges.
Seeing through water: Image restoration using model-based tracking
A novel tracking technique is presented that is designed specifically for water surfaces and addresses two unique challenges: the absence of an object model or template, and the presence of complex appearance changes in the scene due to water fluctuation.
Exploring the Spatial Hierarchy of Mixture Models for Human Pose Estimation
A new hierarchical spatial model that captures an exponential number of poses with a compact mixture representation on each part. It uses latent nodes, so it can represent high-order spatial relationships among parts with exact inference.
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
It is proved that critical points outside the hyperplane spanned by the teacher parameters ("out-of-plane") are not isolated and form manifolds, and in-plane critical-point-free regions are characterized for the two-ReLU case.