• Publications
  • Influence
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
This work addresses the task of semantic image segmentation with Deep Learning and proposes atrous spatial pyramid pooling (ASPP), which is proposed to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. Expand
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Expand
The Secrets of Salient Object Segmentation
An extensive evaluation of fixation prediction and salient object segmentation algorithms as well as statistics of major datasets identifies serious design flaws of existing salient object benchmarks and proposes a new high quality dataset that offers both fixation and salient objects segmentation ground-truth. Expand
Progressive Neural Architecture Search
We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionaryExpand
The Concave-Convex Procedure
It is proved that all expectation-maximization algorithms and classes of Legendre minimization and variational bounding algorithms can be reexpressed in terms of CCCP. Expand
Region Competition: Unifying Snakes, Region Growing, and Bayes/MDL for Multiband Image Segmentation
A novel statistical and variational approach to image segmentation based on a new algorithm, named region competition, derived by minimizing a generalized Bayes/minimum description length (MDL) criterion using the variational principle is presented. Expand
The Role of Context for Object Detection and Semantic Segmentation in the Wild
A novel deformable part-based model is proposed, which exploits both local context around each candidate detection as well as global context at the level of the scene, which significantly helps in detecting objects at all scales. Expand
Generation and Comprehension of Unambiguous Object Descriptions
We propose a method that can generate an unambiguous description (known as a referring expression) of a specific object or region in an image, and which can also comprehend or interpret such anExpand
Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)
The m-RNN model directly models the probability distribution of generating a word given previous words and an image, and achieves significant performance improvement over the state-of-the-art methods which directly optimize the ranking objective function for retrieval. Expand
Detect What You Can: Detecting and Representing Objects Using Holistic Models and Body Parts
This work proposes a novel approach to handle large deformations and partial occlusions in animals in terms of body parts, and applies it to the six animal categories in the PASCAL VOC dataset and shows that it significantly improves state-of-the-art (by 4.1% AP) and provides a richer representation for objects. Expand