SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

@article{Badrinarayanan2017SegNetAD,
  title={SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation},
  author={Vijay Badrinarayanan and Alex Kendall and Roberto Cipolla},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2017},
  volume={39},
  pages={2481-2495}
}
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1] . The role of the decoder network is to map the low resolution encoder feature maps to… 

Figures and Tables from this paper

Weakly-Supervised Learning of a Deep Convolutional Neural Networks for Semantic Segmentation
TLDR
A DCNNs model for generating the pixel-level labels using the image-level annotation based on the orthogonal non-negative matrix factorization (NMF) technology that outperforms some recently developed methods.
SegNetRes-CRF: A Deep Convolutional Encoder-Decoder Architecture for Semantic Image Segmentation
TLDR
This work proposes some modifications in the SegNet-Basic architecture by using a post-processing segmentation layer (using Conditional Random Fields) and by transferring high resolution features combined to the decoder network.
A Novel Upsampling and Context Convolution for Image Semantic Segmentation
TLDR
A novel dense upsampling convolution method based on a guided filter to effectively preserve the spatial information of the image in the network and a novel local context Convolution method that not only covers larger-scale objects in the scene but covers them densely for precise object boundary delineation.
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
TLDR
This work extends DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results especially along object boundaries and applies the depthwise separable convolution to both Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network.
Improvement in Accuracy and Speed of Image Semantic Segmentation via Convolution Neural Network Encoder-Decoder
TLDR
A convolutional neural network composed of encoder-decoder segments based on successful SegNet network that improves the speed and the accuracy compared to the popular networks such as SegNet and DeepLab.
Hybrid connection network for semantic segmentation
TLDR
This paper proposes a hybrid connection network architecture for semantic segmentation which consists of an encoder network for extracting different scale feature maps and a decoding network for recovering extracted feature maps with the resolution of the input image.
Squeeze-SegNet: a new fast deep convolutional neural network for semantic segmentation
TLDR
This work presents a new Deep fully Convolutional Neural Network for pixel-wise semantic segmentation which they call Squeeze-SegNet, based on Encoder-Decoder style and gets SegNet-level accuracy with less than 10 times fewer parameters than SegNet.
PPEDNet: Pyramid Pooling Encoder-Decoder Network for Real-Time Semantic Segmentation
TLDR
A pyramid pooling encoder-decoder network named PPEDNet is proposed for both better accuracy and faster processing speed and is evaluated on CamVid dataset, achieving 7.214% mIOU accuracy improvement while reducing 17.9% of the parameters compared with the state-of theart algorithm.
Gated Convolutional Neural Network for Semantic Segmentation in High-Resolution Images
TLDR
A gated convolutional neural network is proposed that achieves competitive segmentation accuracy on the public ISPRS 2D Semantic Labeling benchmark, which is challenging for segmentation by only using the RGB images.
Convolution-deconvolution architecture with the pyramid pooling module for semantic segmentation
TLDR
A novel and practical deep fully convolutional neural network architecture was introduced for semantic pixel-wise segmentation termed as P-DecovNet, which combines the Convolution-Deconvolution Neural Network architecture with the Pyramid Pooling Module.
...
...

References

SHOWING 1-10 OF 75 REFERENCES
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling
TLDR
The results show that SegNet achieves state-of-the-art performance even without use of additional cues such as depth, video frames or post-processing with CRF models.
Fully Convolutional Networks for Semantic Segmentation
TLDR
It is shown that convolutional networks by themselves, trained end- to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
TLDR
This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).
Conditional Random Fields as Recurrent Neural Networks
TLDR
A new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling is introduced, and top results are obtained on the challenging Pascal VOC 2012 segmentation benchmark.
Learning a Deep Convolutional Network for Image Super-Resolution
TLDR
This work proposes a deep learning method for single image super-resolution (SR) that directly learns an end-to-end mapping between the low/high-resolution images and shows that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network.
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding
TLDR
A practical system which is able to predict pixel-wise class labels with a measure of model uncertainty, and shows that modelling uncertainty improves segmentation performance by 2-3% across a number of state of the art architectures such as SegNet, FCN and Dilation Network, with no additional parametrisation.
Learning Hierarchical Features for Scene Labeling
TLDR
A method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel, alleviates the need for engineered features, and produces a powerful representation that captures texture, shape, and contextual information.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation
TLDR
The decoupled architecture enables the algorithm to learn classification and segmentation networks separately based on the training data with image-level and pixel-wise class labels, respectively, and facilitates to reduce search space for segmentation effectively by exploiting class-specific activation maps obtained from bridging layers.
U-Net: Convolutional Networks for Biomedical Image Segmentation
TLDR
It is shown that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
...
...