• Publications
  • Influence
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
Designing accurate and efficient ConvNets for mobile devices is challenging because the design space is combinatorially large. Due to this, previous neural architecture search (NAS) methods are...
  • 290
  • 65
Single Image 3D Interpreter Network
Understanding 3D object structure from a single image is an important but difficult task in computer vision, mostly due to the lack of 3D object annotations in real images. Previous work tackles this...
  • 222
  • 27
Building Generalizable Agents with a Realistic and Rich 3D Environment
Towards bridging the gap between machine and human intelligence, it is of utmost importance to introduce environments that are visually realistic and rich in content. In such environments, one can...
  • 171
  • 19
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation function, i.e., $f(\mathbf{Z}; \mathbf{w}, \mathbf{a}) = \sum_j \dots$
  • 158
  • 18
Simple Baseline for Visual Question Answering
We describe a very simple bag-of-words baseline for visual question answering. This baseline concatenates the word features from the question and CNN features from the image to predict the answer.
  • 208
  • 17
Seeing through water: Image restoration using model-based tracking
A video sequence of an underwater scene taken from above the water surface suffers from severe distortions due to water fluctuations. In this paper, we simultaneously estimate the shape of the water...
  • 81
  • 14
Algorithmic Framework for Model-based Reinforcement Learning with Theoretical Guarantees
Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL. However, the theoretical understanding of such methods has...
  • 56
  • 14
Exploring the Spatial Hierarchy of Mixture Models for Human Pose Estimation
Human pose estimation requires a versatile yet well-constrained spatial model for grouping locally ambiguous parts together to produce a globally consistent hypothesis. Previous works either use...
  • 124
  • 13
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
In this paper, we explore theoretical properties of training a two-layered ReLU network $g(\mathbf{x}; \mathbf{w}) = \sum_{j=1}^K \sigma(\mathbf{w}_j^T\mathbf{x})$ with centered $d$-dimensional...
  • 135
  • 12
EasyAlbum: an interactive photo annotation system based on face clustering and re-ranking
Digital photo management is becoming indispensable for the explosively growing family photo albums due to the rapid popularization of digital cameras and mobile phone cameras. In an effective photo...
  • 149
  • 10