Semantic Object Parsing with Graph LSTM

@inproceedings{Liang2016SemanticOP,
  title={Semantic Object Parsing with Graph LSTM},
  author={Xiaodan Liang and Xiaohui Shen and Jiashi Feng and Liang Lin and Shuicheng Yan},
  booktitle={ECCV},
  year={2016}
}
By taking the semantic object parsing task as an exemplar application scenario, we propose the Graph Long Short-Term Memory (Graph LSTM) network, which is the generalization of LSTM from sequential data or multi-dimensional data to general graph-structured data. Particularly, instead of evenly and fixedly dividing an image to pixels or patches in existing multi-dimensional LSTM structures (e.g., Row, Grid and Diagonal LSTMs), we take each arbitrary-shaped superpixel as a semantically consistent… Expand
Dynamic-Structured Semantic Propagation Network
TLDR
A Dynamic-Structured Semantic Propagation Network (DSSPN) is proposed that builds a semantic neuron graph by explicitly incorporating the semantic concept hierarchy into network construction and demonstrates the superiority of the DSSPN over state-of-the-art segmentation models. Expand
Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection
TLDR
A deep Variation-structured Re-inforcement Learning (VRL) framework is proposed to sequentially discover object relationships and attributes in the whole image, and an ambiguity-aware object mining scheme is used to resolve semantic ambiguity among object categories that the object detector fails to distinguish. Expand
Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer
TLDR
This work proposes a graph reasoning and transfer learning framework, named Graphonomy, which incorporates human knowledge and label taxonomy into the intermediate graph representation learning beyond local convolutions, and applies it to two relevant but different image understanding research topics: human parsing and panoptic segmentation. Expand
Fast scene labeling via structural inference
TLDR
This paper proposes a bi-directional recurrent network to model the information flow along the parent-child path via structural inference in a fast LSTM scene labeling network, and it outperforms state-of-the-art methods on different benchmarks. Expand
Progressively diffused networks for semantic visual parsing
TLDR
This work proposes Progressively Diffused Networks (PDNs) to deal with complex semantic parsing tasks and demonstrates the effectiveness of PDNs to substantially improve the performances of existing LSTM-based models. Expand
Progressively Diffused Networks for Semantic Image Segmentation
TLDR
The Progressively Diffused Networks framework demonstrates the effectiveness to substantially improve the performances over the popular existing neural network models, and achieves state-of-the-art on ImageNet Parsing for large scale semantic segmentation. Expand
Multi-scale Context Intertwining for Semantic Segmentation
TLDR
This work proposes a novel scheme for aggregating features from different scales, which it refers to as Multi-Scale Context Intertwining (MSCI), which merge pairs of feature maps in a bidirectional and recurrent fashion, via connections between two LSTM chains. Expand
Relational graph neural network for situation recognition
TLDR
Experimental results show that the proposed RGNN outperforms the state-of-the-art methods on verb and value metrics, and demonstrates better relationships between the activity and the objects. Expand
Semi-supervised hierarchical semantic object parsing
TLDR
This work focuses on the task of instance segmentation and parsing which recognizes and localizes objects down to a pixel level base on deep CNN, and incorporates a Conditional Random Field with end-to-end trainable piecewise order potentials based on object parsing outputs. Expand
Exploring and Exploiting the Hierarchical Structure of a Scene for Scene Graph Generation
TLDR
A novel neural network model is used to construct a hierarchical structure whose leaf nodes correspond to objects depicted in the image, and a message is passed along the estimated structure on the fly to maintain global consistency. Expand
...
1
2
3
4
5
...

References

SHOWING 1-10 OF 57 REFERENCES
Semantic Object Parsing with Local-Global Long Short-Term Memory
TLDR
A novel deep Local-Global Long Short-Term Memory architecture to seamlessly incorporate short-distance and long-distance spatial dependencies into the feature learning over all pixel positions and demonstrates the significant superiority of this LG-LSTM over other state-of-the-art methods. Expand
Semantic Image Segmentation via Deep Parsing Network
TLDR
This paper addresses semantic image segmentation by incorporating rich information into Markov Random Field (MRF), including high-order relations and mixture of label contexts by proposing a Convolutional Neural Network (CNN), namely Deep Parsing Network (DPN), which enables deterministic end-to-end computation in a single forward pass. Expand
Scene labeling with LSTM recurrent neural networks
TLDR
The approach, which has a much lower computational complexity than prior methods, achieved state-of-the-art performance over the Stanford Background and the SIFT Flow datasets and the ability to visualize feature maps from each layer supports the hypothesis that LSTM networks are overall suited for image processing tasks. Expand
Joint Object and Part Segmentation Using Deep Learned Potentials
TLDR
This paper proposes a joint solution that tackles semantic object and part segmentation simultaneously, in which higher object-level context is provided to guide part segmentsation, and more detailed part-level localization is utilized to refine object segmentation. Expand
Human Parsing with Contextualized Convolutional Neural Network
In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which well integrates the cross-layer context, global image-levelExpand
Fully Convolutional Networks for Semantic Segmentation
TLDR
It is shown that convolutional networks by themselves, trained end- to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Expand
Grid Long Short-Term Memory
TLDR
The Grid LSTM is used to define a novel two-dimensional translation model, the Reencoder, and it is shown that it outperforms a phrase-based reference system on a Chinese-to-English translation task. Expand
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
TLDR
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Expand
Attention to Scale: Scale-Aware Semantic Image Segmentation
TLDR
An attention mechanism that learns to softly weight the multi-scale features at each pixel location is proposed, which not only outperforms averageand max-pooling, but allows us to diagnostically visualize the importance of features at different positions and scales. Expand
Parsing Semantic Parts of Cars Using Graphical Models and Segment Appearance Consistency
TLDR
This paper addresses the problem of semantic part parsing (segmentation) of cars as a landmark identification problem, where a set of landmarks specifies the boundaries of the parts and proposes a novel mixture of graphical models, which dynamically couples the landmarks to a hierarchy of segments. Expand
...
1
2
3
4
5
...