DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation

@article{Zhang2021DCNASDC,
  title={DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation},
  author={Xiong Zhang and Hongmin Xu and Hong Mo and Jianchao Tan and Cheng Yang and Wenqi Ren},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={13951--13962}
}
  • Xiong Zhang, Hongmin Xu, Wenqi Ren
  • Published 26 March 2020
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Existing NAS methods for dense image prediction tasks usually compromise on a restricted search space or search on a proxy task to meet achievable computational demands. To allow as wide a range of network architectures as possible and avoid the gap between the realistic and proxy settings, we propose a novel Densely Connected NAS (DCNAS) framework, which directly searches for the optimal network structures for multi-scale representations of visual information over a large-scale target dataset without proxy…

Edge-Preserving Guided Semantic Segmentation for VIPriors Challenge
TLDR
This work proposes edge-preserving guidance to obtain extra prior information and avoid overfitting on a small-scale training dataset, and demonstrates that the proposed method achieves excellent performance on a small-scale training set compared to state-of-the-art semantic segmentation techniques.
Learning Versatile Neural Architectures by Propagating Network Codes
TLDR
Network Coding Propagation (NCP), a novel “neural predictor” that can predict an architecture’s performance across multiple datasets and tasks, enables a single architecture to be applied to both image segmentation and video recognition.
Real-Time Segmentation Networks Should be Latency Aware
TLDR
It is argued that the commonly used performance metric of mean Intersection over Union (mIoU) does not fully capture the information required to estimate the true performance of these networks when they operate in ‘real-time’.
Image Segmentation Using Deep Learning: A Survey
TLDR
A comprehensive review of recent pioneering efforts in semantic and instance segmentation is provided, covering convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings.
Weight-Sharing Neural Architecture Search: A Battle to Shrink the Optimization Gap
TLDR
A literature review on the application of NAS to computer vision problems is provided and existing approaches are summarized into several categories according to their efforts in bridging the gap.
A U-Net Based Approach for Automating Tribological Experiments
TLDR
This work utilizes convolutional neural networks (CNNs) to automate a common experimental setup whereby an endoscopic camera was used to measure the contact area between a rubber sample and a spherical counterpart, creating a much more efficient and versatile experimental setup.
Convolutional Embedding Makes Hierarchical Vision Transformer Stronger
TLDR
This paper explores in depth how the macro architecture of hybrid CNNs/ViTs enhances the performance of hierarchical ViTs, and systematically reveals how convolutional embedding (CE) injects desirable inductive bias into ViTs.
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision
TLDR
This paper proposes a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff, and employs parametrized channel slimming by stepwise downward knowledge distillation during training.
Multi-Prior Learning via Neural Architecture Search for Blind Face Restoration
TLDR
This work proposes a Face Restoration Searching Network (FRSNet) that adaptively searches for a suitable feature extraction architecture within the authors' specified search space, which directly contributes to restoration quality, and designs the Multiple Facial Prior Searching Network (MFPSNet) with a multi-prior learning scheme.
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks
TLDR
A global-to-local search scheme is proposed that uses global search to find coarse receptive field combinations and local search to refine them further, together with an expectation-guided iterative local search scheme that refines the combinations effectively.

References

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation
TLDR
This paper presents a network level search space that includes many popular designs, and develops a formulation that allows efficient gradient-based architecture search and demonstrates the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets.
SqueezeNAS: Fast Neural Architecture Search for Faster Semantic Segmentation
TLDR
This work presents what the authors believe to be the first proxyless hardware-aware search targeted at dense semantic segmentation, run on the Cityscapes semantic segmentation dataset, and demonstrates that significant performance gains are possible by using NAS to find networks optimized for both the specific task and the inference hardware.
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells
TLDR
This work focuses on dense per-pixel tasks, in particular semantic image segmentation using fully convolutional networks, and relies on a progressive strategy that stops training unpromising architectures early, and on Polyak averaging coupled with knowledge distillation to speed up convergence.
Searching for Efficient Multi-Scale Architectures for Dense Image Prediction
TLDR
This work constructs a recursive search space for meta-learning techniques for dense image prediction focused on the tasks of scene parsing, person-part segmentation, and semantic image segmentation and demonstrates that even with efficient random search, this architecture can outperform human-invented architectures.
SparseMask: Differentiable Connectivity Learning for Dense Image Prediction
TLDR
This paper designs a densely connected network with learnable connections, named Fully Dense Network, which contains a large set of possible final connectivity structures, and employs gradient descent to search for the optimal connectivity among the dense connections.
Fully Convolutional Networks for Semantic Segmentation
TLDR
It is shown that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation.
BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation
TLDR
A novel Bilateral Segmentation Network (BiSeNet) is proposed that strikes the right balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets.
RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation
TLDR
RefineNet is presented, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections and introduces chained residual pooling, which captures rich background context in an efficient manner.
Dense Relation Network: Learning Consistent and Context-Aware Representation for Semantic Image Segmentation
TLDR
This paper proposes dense relation network (DRN) and context-restricted loss (CRL) to aggregate global and local information to make the best of global context.
Attention to Scale: Scale-Aware Semantic Image Segmentation
TLDR
An attention mechanism that learns to softly weight the multi-scale features at each pixel location is proposed, which not only outperforms average- and max-pooling, but allows us to diagnostically visualize the importance of features at different positions and scales.