Spatial Sampling Network for Fast Scene Understanding

@inproceedings{Mazzini2019SpatialSN,
  title={Spatial Sampling Network for Fast Scene Understanding},
  author={Davide Mazzini and Raimondo Schettini},
  booktitle={2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year={2019},
  pages={1286--1296}
}
  • Davide Mazzini, R. Schettini
  • Published 22 May 2019
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
We propose a network architecture to perform efficient scene understanding. This work presents three main novelties: the first is a module named Improved Guided Upsampling Module that can replace in toto the decoder part in common semantic segmentation networks. Our second contribution is the introduction of a new module based on spatial sampling to perform Instance Segmentation. It provides a very fast instance segmentation needing only a simple post-processing step at inference time. Finally…
Training Efficient Semantic Segmentation CNNs on Multiple Datasets
TLDR
This work investigates a simple approach to modify semantic segmentation CNNs in order to train them on multiple datasets with heterogeneous ground truths, and shows that the networks are trainable with the implemented method, although it also highlights the limits of current efficient architectures when dealing with heterogeneous and large datasets.
A Survey on Deep Learning Methods for Semantic Image Segmentation in Real-Time
  • G. Takos
  • Computer Science, Mathematics
  • ArXiv
  • 2020
TLDR
This work provides a comprehensive analysis of state-of-the-art deep learning architectures in image segmentation and an extensive list of techniques to achieve fast inference and computational efficiency.
MiniNet: An Efficient Semantic Segmentation ConvNet for Real-Time Robotic Applications
TLDR
A novel architecture, MiniNet-v2, an enhanced version of MiniNet, is developed. It is built considering the best option depending on CPU or GPU availability, and it reaches accuracy comparable to state-of-the-art models while using less memory and fewer computational resources.
Progressive Knowledge-Embedded Unified Perceptual Parsing for Scene Understanding
  • Wenbo Zheng, Lan Yan, Fei-Yue Wang, Chao Gou
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2021
TLDR
A novel progressive knowledge-embedded representation learning framework is proposed that incorporates knowledge graphs of different levels into the learning of networks at the corresponding levels, and it is shown to be superior to existing state-of-the-art methods.
Implicit Integration of Superpixel Segmentation into Fully Convolutional Networks
TLDR
This paper proposes a way to implicitly integrate a superpixel scheme into CNNs, which makes it easy to use superpixels with CNNs in an end-to-end fashion and preserves detailed information such as object boundaries in the form of superpixels even when the model contains downsampling layers.
CenterPoly: real-time instance segmentation using bounding polygons
We present a novel method, called CenterPoly, for real-time instance segmentation using bounding polygons. We apply it to detect road users in dense urban environments, making it suitable for…
A Novel Approach to Data Augmentation for Pavement Distress Segmentation
TLDR
It is shown how, starting from a few labeled images, it is possible to augment small and long-tail datasets by producing new images with the associated semantic layouts, and that CNNs trained on the augmented datasets show a remarkable increase in performance over those trained on the original datasets, especially for low-cardinality classes.
Scene Semantic Recognition Based on Modified Fuzzy C-Mean and Maximum Entropy Using Object-to-Object Relations
TLDR
A novel SSR framework is proposed that intelligently segments the locations of objects, generates a novel Bag of Features, and recognizes scenes via Maximum Entropy; it is applicable to various emerging technologies, such as augmented reality.
Benchmarking algorithms for food localization and semantic segmentation
TLDR
This paper conducts extensive experiments to evaluate ten state-of-the-art segmentation algorithms on two tasks: food localization and semantic food segmentation.
Analyzing and Recognizing Food in Constrained and Unconstrained Environments
Recently, computer-vision-based image analysis techniques have attracted a lot of attention because they are used to develop automatic dietary monitoring applications. Food recognition is a quite…

References

Fast Scene Understanding for Autonomous Driving
TLDR
This paper presents an efficient real-time implementation based on ENet that solves three autonomous-driving-related tasks at once: semantic scene segmentation, instance segmentation, and monocular depth estimation.
Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation
TLDR
A novel convolutional network named Efficient Dense modules with Asymmetric convolution (EDANet) is proposed, which employs an asymmetric convolution structure and incorporates dilated convolution and dense connectivity to achieve high efficiency at low computational cost and model size.
Guided Upsampling Network for Real-Time Semantic Segmentation
TLDR
A neural network named Guided Upsampling Network is proposed; it consists of a multiresolution architecture that jointly exploits high-resolution and large context information and can be plugged into any existing encoder-decoder architecture with few modifications and low additional computation cost.
Pyramid Scene Parsing Network
TLDR
This paper exploits the capability of global context information by different-region-based context aggregation through the pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet) to produce good quality results on the scene parsing task.
ERFNet: Efficient Residual Factorized ConvNet for Real-Time Semantic Segmentation
TLDR
A deep architecture that is able to run in real time while providing accurate semantic segmentation, and a novel layer that uses residual connections and factorized convolutions in order to remain efficient while retaining remarkable accuracy, are proposed.
Bottom-up Instance Segmentation using Deep Higher-Order CRFs
TLDR
This work focuses on the task of Instance Segmentation, which recognises and localises objects down to a pixel level, and uses a deep neural network trained for semantic segmentation to reason about instances.
ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time
TLDR
ContextNet, a new deep neural network architecture, is proposed; it builds on factorized convolution, network compression, and pyramid representation to produce competitive semantic segmentation in real time with low memory requirements.
Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation
TLDR
A multi-resolution reconstruction architecture based on a Laplacian pyramid that uses skip connections from higher resolution feature maps and multiplicative gating to successively refine segment boundaries reconstructed from lower-resolution maps is described.
Instance-Aware Semantic Segmentation via Multi-task Network Cascades
  • Jifeng Dai, Kaiming He, Jian Sun
  • Computer Science
  • 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2016
TLDR
This paper presents Multitask Network Cascades for instance-aware semantic segmentation, which consists of three networks, respectively differentiating instances, estimating masks, and categorizing objects, and develops an algorithm for the nontrivial end-to-end training of this causal, cascaded structure.
RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation
TLDR
RefineNet, a generic multi-path refinement network, is presented; it explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections, and it introduces chained residual pooling, which captures rich background context in an efficient manner.