• Corpus ID: 236635148

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

  title={Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation},
  author={Bowen Zhang and Yifan Liu and Zhi Tian and Chunhua Shen},
Semantic segmentation requires per-pixel prediction for a given image. Typically, the output resolution of a segmentation network is severely reduced due to the downsampling operations in the CNN backbone. Most previous methods employ upsampling decoders to recover the spatial resolution. Various decoders were designed in the literature. Here, we propose a novel decoder, termed dynamic neural representational decoder (NRD), which is simple yet significantly more efficient. As each location on the… 
4 Citations

Figures and Tables from this paper

SegViT: Semantic Segmentation with Plain Vision Transformers

This work explores the capability of plain Vision Transformers for semantic segmentation, and proposes the Attention-to-Mask (ATM) module, in which the similarity maps between a set of learnable class tokens and the spatial feature maps are transferred to the segmentation masks.

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation

Experimental results demonstrate that the proposed TopFormer method significantly outperforms CNN- and ViT-based networks across several semantic segmentation datasets and achieves a good trade-off between accuracy and latency.

Deep Semantic Segmentation of Trees Using Multispectral Images

Forests can be efficiently monitored by automatic semantic segmentation of trees using satellite and/or aerial images. Still, several challenges can make the problem difficult, including the varying

The devil is in the labels: Semantic segmentation from sentences

An approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting, and that the method enables impressive performance improvements in downstream applications, including depth estimation and instance segmentation.



Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation

This work proposes a data-dependent upsampling (DUpsampling) to replace bilinear, which takes advantages of the redundancy in the label space of semantic segmentation and is able to recover the pixel-wise prediction from low-resolution outputs of CNNs.

EfficientFCN: Holistically-guided Decoding for Semantic Segmentation

The EfficientFCN is proposed, whose backbone is a common ImageNet pre-trained network without any dilated convolution, and achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

This work extends DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results especially along object boundaries and applies the depthwise separable convolution to both Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network.

Fully convolutional networks for semantic segmentation

The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation

This work shows how to improve semantic segmentation through the use of contextual information, specifically, ' patch-patch' context between image regions, and 'patch-background' context, and formulate Conditional Random Fields with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches.

Understanding Convolution for Semantic Segmentation

DUC is designed to generate pixel-level prediction, which is able to capture and decode more detailed information that is generally missing in bilinear upsampling, and a hybrid dilated convolution (HDC) framework in the encoding phase is proposed.

Learning Deconvolution Network for Semantic Segmentation

A novel semantic segmentation algorithm by learning a deep deconvolution network on top of the convolutional layers adopted from VGG 16-layer net, which demonstrates outstanding performance in PASCAL VOC 2012 dataset.

DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

This work addresses the task of semantic image segmentation with Deep Learning and proposes atrous spatial pyramid pooling (ASPP), which is proposed to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.

Higher Order Conditional Random Fields in Deep Neural Networks

Two types of higher order potentials, based on object detections and superpixels, can be included in a CRF embedded within a deep network to allow inference with the differentiable mean field algorithm and are designed to achieve state-of-the-art segmentation performance on the PASCAL VOC benchmark.

Context Encoding for Semantic Segmentation

The proposed Context Encoding Module significantly improves semantic segmentation results with only marginal extra computation cost over FCN, and can improve the feature representation of relatively shallow networks for the image classification on CIFAR-10 dataset.