• Corpus ID: 52156503

OCNet: Object Context Network for Scene Parsing

@article{Yuan2018OCNetOC,
  title={OCNet: Object Context Network for Scene Parsing},
  author={Yuhui Yuan and Jingdong Wang},
  journal={ArXiv},
  year={2018},
  volume={abs/1809.00916}
}
In this paper, we address the problem of scene parsing with deep learning and focus on the context aggregation strategy for robust segmentation. Our implementation, inspired by the self-attention approach, consists of two steps: (i) compute the similarities between each pixel and all the pixels, forming a so-called object context map for each pixel that serves as a surrogate for the true object context, and (ii) represent the pixel by aggregating the features of all the pixels weighted by the…
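
The two steps described in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `object_context_pooling`, the dot-product similarity, and the softmax normalization of the weights are assumptions borrowed from standard self-attention, since the abstract above is truncated before it states how the weights are defined.

```python
import math


def softmax(row):
    """Normalize a list of scores into weights that sum to 1."""
    peak = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def object_context_pooling(features):
    """Hypothetical helper: features is a list of N pixel vectors (C floats each).

    Returns (context_map, aggregated), where context_map[i][j] is the
    normalized similarity of pixel i to pixel j, and aggregated[i] is the
    weighted sum of all pixel features using row i of the context map.
    """
    # Step (i): similarity between each pixel and all pixels
    # (assumed here: dot product followed by a per-pixel softmax).
    context_map = [
        softmax([dot(f_i, f_j) for f_j in features]) for f_i in features
    ]
    # Step (ii): represent each pixel by the weighted sum of all features.
    channels = len(features[0])
    aggregated = [
        [sum(w * f_j[c] for w, f_j in zip(weights, features))
         for c in range(channels)]
        for weights in context_map
    ]
    return context_map, aggregated


# Three toy "pixels": the first two are similar, the third is different.
pixels = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
context_map, aggregated = object_context_pooling(pixels)
```

In this toy input, the first pixel assigns more weight to the second pixel than to the third, so its aggregated representation is dominated by features of pixels that resemble it.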

Object-Contextual Representations for Semantic Segmentation

This paper addresses the semantic segmentation problem with a focus on the context aggregation strategy, and presents a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class.

Point Set Attention Network For Semantic Segmentation

A Point Set Attention Network (PSANet) is proposed to improve the self-attention mechanism by correcting noisy pixels so that pixels of the same class mutually improve one another.

Pyramidal region context module for semantic segmentation

This work designs the Pyramidal Region Context Module (PRCM) to handle the neighbor relationship of multi-scale regions, and introduces an end-to-end segmentation network - PRNet.

Mining Contextual Information Beyond Image for Semantic Segmentation

This paper proposes to mine the contextual information beyond individual images to further augment the pixel representations and proposes a representation consistent learning strategy to make the classification head better address intra-class compactness and inter-class dispersion.

Scene Segmentation With Dual Relation-Aware Attention Network

A Dual Relation-aware Attention Network (DRANet) is proposed for scene segmentation, with two types of compact attention modules that model the contextual dependencies in the spatial and channel dimensions, respectively.

Interaction via Bi-directional Graph of Semantic Region Affinity for Scene Parsing

This work proposes a region-level loss that evaluates all pixels in a region as a whole and motivates the network to learn an exclusive regional feature per class, achieving new state-of-the-art segmentation results consistently on PASCAL-Context, ADE20K, and COCO-Stuff.

Dual Attention Network for Scene Segmentation

New state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context and COCO Stuff dataset is achieved without using coarse data.

Semantic Segmentation via Pixel-to-Center Similarity Calculation

This paper rethinks semantic segmentation from the perspective of similarity between pixels and class centers, and proposes a Class Center Similarity layer (CCS layer) to address the above-mentioned challenges by generating adaptive class centers conditioned on different scenes and supervising the similarities between class centers.

ISNet: Integrate Image-Level and Semantic-Level Context for Semantic Segmentation

Co-occurrent visual pattern makes aggregating contextual information a common paradigm to enhance the pixel representation for semantic image segmentation. The existing approaches focus on modeling

Attention-based dual context aggregation for image semantic segmentation

The Dual Context Aggregation Module (DCM) is presented, which comprises two attention modules that obtain dense contextual information by modeling relations between positions and between channels; it is used to construct the Dual Context Aggregation Network (DCNet).

References

Object-Contextual Representations for Semantic Segmentation

This paper addresses the semantic segmentation problem with a focus on the context aggregation strategy, and presents a simple yet effective approach, object-contextual representations, characterizing a pixel by exploiting the representation of the corresponding object class.

Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships

This work presents the Structure Inference Network (SIN), a detector that incorporates a graphical model into a typical detection framework to infer object state. Comprehensive experiments indicate that scene context and object relationships improve object detection performance, yielding more desirable and reasonable outputs.

Dual Attention Network for Scene Segmentation

New state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context and COCO Stuff dataset is achieved without using coarse data.

Semantic Correlation Promoted Shape-Variant Context for Segmentation

This work proposes a novel paired convolution that infers the semantic correlation of the pair and, based on it, generates a shape mask; the mask controls the receptive field, which varies with the appearance of the input.

Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation

A novel context-contrasted local feature is proposed that not only leverages the informative context but also spotlights the local information in contrast to the context, greatly improving parsing performance.

Microsoft COCO: Common Objects in Context

We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene

CCNet: Criss-Cross Attention for Semantic Segmentation

  • Zilong Huang, Xinggang Wang, Wenyu Liu
  • Computer Science
    2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2019
This work proposes a Criss-Cross Network (CCNet) for obtaining contextual information more effectively and efficiently, and achieves mIoU scores of 81.4 on the Cityscapes test set and 45.22 on the ADE20K validation set, which are new state-of-the-art results.

Adaptive Pyramid Context Network for Semantic Segmentation

This paper introduces three desirable properties of context features in segmentation task and finds that Global-guided Local Affinity (GLA) can play a vital role in constructing effective context features, while this property has been largely ignored in previous works.

Recurrent Scene Parsing with Perspective Understanding in the Loop

This work proposes a depth-aware gating module that adaptively selects the pooling field size in a convolutional network architecture according to the object scale so that small details are preserved for distant objects while larger receptive fields are used for those nearby.

Context Encoding for Semantic Segmentation

The proposed Context Encoding Module significantly improves semantic segmentation results with only marginal extra computation cost over FCN, and can improve the feature representation of relatively shallow networks for the image classification on CIFAR-10 dataset.