BCNet: Searching for Network Width with Bilaterally Coupled Network

@article{Su2021BCNetSF,
  title={BCNet: Searching for Network Width with Bilaterally Coupled Network},
  author={Xiu Su and Shan You and Fei Wang and Chen Qian and Changshui Zhang and Chang Xu},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={2175-2184}
}
  • Xiu Su, Shan You, Chang Xu
  • Published 21 May 2021
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Searching for a more compact network width recently serves as an effective way of channel pruning for the deployment of convolutional neural networks (CNNs) under hardware constraints. To fulfill the searching, a one-shot supernet is usually leveraged to efficiently evaluate the performance w.r.t. different network widths. However, current methods mainly follow a unilaterally augmented (UA) principle for the evaluation of each width, which induces the training unfairness of channels in supernet… 
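
Below is a minimal PyTorch sketch contrasting the two evaluation schemes the abstract refers to: under the unilaterally augmented (UA) principle a candidate width k is evaluated with only the leftmost k channels of a supernet layer, so the rightmost channels rarely receive training signal, whereas a bilaterally coupled evaluation also exercises the rightmost k channels. The function names and the simple averaging are illustrative assumptions, not BCNet's exact formulation.

import torch
import torch.nn.functional as F

def eval_width_unilateral(conv_weight, x, k):
    """Evaluate width k with only the leftmost k output channels (UA principle)."""
    return F.conv2d(x, conv_weight[:k], padding=1)

def eval_width_bilateral(conv_weight, x, k):
    """Illustrative bilateral evaluation: average the responses from the leftmost
    and the rightmost k channels, so both ends of the supernet layer are trained
    for every candidate width."""
    left = F.conv2d(x, conv_weight[:k], padding=1)
    right = F.conv2d(x, conv_weight[-k:], padding=1)
    return 0.5 * (left + right)

# toy usage: a 16-channel supernet layer evaluated at width k = 8
weight = torch.randn(16, 3, 3, 3, requires_grad=True)
x = torch.randn(1, 3, 32, 32)
y_ua = eval_width_unilateral(weight, x, 8)   # only channels 0..7 get gradients here
y_bc = eval_width_bilateral(weight, x, 8)    # all 16 channels participate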

ViTAS: Vision Transformer Architecture Search

This paper develops a new cyclic weight-sharing mechanism for the token embeddings of ViTs, which enables each channel to contribute more evenly to all candidate architectures, and proposes identity shifting to alleviate the many-to-one issue in the superformer.

Soft Masking for Cost-Constrained Channel Pruning

This work proposes Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint.
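
A hedged sketch of the soft-masking idea described above: the mask is re-derived from the dense weights at every forward pass and applied only to the output, so a channel pruned at one iteration can return later if its importance recovers. The per-channel L2-norm importance and the fixed keep ratio are illustrative assumptions; SMCP itself prunes toward an explicit cost constraint.

import torch
import torch.nn as nn

class SoftMaskedConv(nn.Module):
    """Conv layer whose output channels are softly masked: pruned channels are
    zeroed in the forward pass only, and the mask is recomputed from the dense
    weights each iteration, so a pruned channel can adaptively return."""
    def __init__(self, in_ch, out_ch, keep_ratio=0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.keep_ratio = keep_ratio

    def forward(self, x):
        w = self.conv.weight
        importance = w.flatten(1).norm(dim=1)          # per-output-channel L2 norm
        k = max(1, int(self.keep_ratio * w.size(0)))
        kept = importance.topk(k).indices
        mask = torch.zeros(w.size(0), device=w.device)
        mask[kept] = 1.0
        return self.conv(x) * mask.view(1, -1, 1, 1)   # mask applied only in forward

layer = SoftMaskedConv(3, 16, keep_ratio=0.25)
out = layer(torch.randn(2, 3, 32, 32))                 # 12 of 16 channels are zeroed this step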

Vision Transformer Architecture Search

This paper designs a new effective yet efficient weight sharing paradigm for ViTs, such that architectures with different token embedding, sequence size, number of heads, width, and depth can be derived from a single super-transformer.

Sufficient Vision Transformer

In this paper, Sufficiency-Blocks (S-Blocks) are proposed and applied across the depth of Suf-ViT to disentangle and discard task-irrelevant information accurately, and a Sufficient-Reduction Loss (SRLoss) leveraging the concept of Mutual Information (MI) is formulated that enables Suf-ViT to extract more reliable sufficient representations by removing task-irrelevant information.

ScaleNet: Searching for the Model to Scale

Experimental results show that the searched architectures by the proposed ScaleNet with various FLOPs budgets can outperform the referred methods on various datasets, including ImageNet-1k and fine-tuning tasks.

Weakly Supervised Contrastive Learning

  • Mingkai Zheng, Fei Wang, Chang Xu
  • Computer Science
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2021
A weakly supervised contrastive learning framework (WCL) based on two projection heads, where one head performs the regular instance discrimination task and the other uses a graph-based method to explore similar samples and generate weak labels that pull similar images closer.
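
A rough sketch of that graph-based branch under simple assumptions: samples are linked to their nearest neighbours in embedding space and the connected components of the resulting graph serve as weak labels; the k-NN construction below is illustrative rather than the paper's exact procedure.

import torch
import torch.nn.functional as F
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def weak_labels_from_graph(embeddings, k=1):
    """Link each sample to its k nearest neighbours in embedding space and use
    connected components as weak labels, so samples in the same component are
    treated as positives."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t()
    sim.fill_diagonal_(-float('inf'))               # exclude self-matches
    nn_idx = sim.topk(k, dim=1).indices             # k nearest neighbours per sample
    n = z.size(0)
    rows = torch.arange(n).repeat_interleave(k).numpy()
    cols = nn_idx.flatten().numpy()
    adj = csr_matrix((torch.ones(n * k).numpy(), (rows, cols)), shape=(n, n))
    _, labels = connected_components(adj, directed=False)
    return torch.from_numpy(labels)                  # one weak label per sample

labels = weak_labels_from_graph(torch.randn(8, 128))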

Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning

A novel sub-network search and fine-tuning method named Ensemble Knowledge Guidance (EKG) is proposed, which experimentally shows that the fluctuation of the loss landscape is an effective metric for evaluating potential performance.

HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D Medical Image Segmentation using HyperNet

This work introduces a HyperNet to assist super-net training by incorporating architecture topology information, and shows that HyperSegNAS yields better performing and more intuitive architectures compared to the previous state-of-the-art segmentation networks; it can quickly and accurately find good architecture candidates under different computing constraints.

Patch Slimming for Efficient Vision Transformers

  • Yehui Tang, Kai Han, D. Tao
  • Computer Science
  • 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2022
A novel patch slimming approach is presented that discards useless patches in a top-down paradigm and can significantly reduce the computational cost of vision transformers without affecting their performance.
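
An illustrative sketch of one slimming step under simple assumptions: each patch token gets an importance score (here its L2 norm, a stand-in for the paper's criterion) and only the top-scoring fraction is kept, with the class token always preserved.

import torch

def slim_patches(tokens, keep_ratio=0.5):
    """Score each patch token and keep only the top-scoring fraction; the class
    token at index 0 is always preserved."""
    cls_tok, patches = tokens[:, :1], tokens[:, 1:]
    scores = patches.norm(dim=-1)                       # (B, N) importance per patch
    k = max(1, int(keep_ratio * patches.size(1)))
    idx = scores.topk(k, dim=1).indices                 # indices of kept patches
    idx = idx.unsqueeze(-1).expand(-1, -1, patches.size(-1))
    kept = patches.gather(1, idx)
    return torch.cat([cls_tok, kept], dim=1)

tokens = torch.randn(2, 197, 768)                        # ViT-B: 1 class token + 196 patches
slimmed = slim_patches(tokens, keep_ratio=0.25)          # -> (2, 50, 768)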

K-shot NAS: Learnable Weight-Sharing for NAS with K-shot Supernets

K-shot NAS significantly improves the evaluation accuracy of paths and thus brings in impressive performance improvements, and can be iteratively trained and extended to the channel dimension.
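
A hedged sketch of k-shot weight sharing for a single convolution: K copies of the layer weights are kept, and the weights used by a sampled path are a learned convex combination of them; the small module that maps a path encoding to mixing coefficients is an illustrative assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

class KShotConv(nn.Module):
    """Keep K copies of a layer's weights and form the weights used by a sampled
    path as a learned convex combination of them."""
    def __init__(self, in_ch, out_ch, K=4, code_dim=8):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(K, out_ch, in_ch, 3, 3) * 0.02)
        self.simplex = nn.Linear(code_dim, K)          # maps a path code to K logits

    def forward(self, x, path_code):
        lam = F.softmax(self.simplex(path_code), dim=-1)        # (K,), sums to 1
        w = torch.einsum('k,koihw->oihw', lam, self.weights)    # path-specific weights
        return F.conv2d(x, w, padding=1)

layer = KShotConv(3, 16, K=4)
y = layer(torch.randn(1, 3, 32, 32), torch.randn(8))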

References

Showing 1-10 of 49 references.

Locally Free Weight Sharing for Network Width Search

This paper proposes a loCAlly FrEe weight sharing strategy (CafeNet), which can be trained stochastically and optimized with a min-min strategy, and can further boost the benchmark NAS network EfficientNet-B0 by 0.41% by searching its width more delicately.

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient and is demonstrated the effectiveness of this method on scaling up MobileNets and ResNet.
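
The compound coefficient can be written down directly: depth, width and resolution are scaled by α^φ, β^φ and γ^φ for a single coefficient φ, with α·β²·γ² ≈ 2 so that FLOPs grow roughly by 2^φ. A small sketch follows; the α, β, γ values are those reported for EfficientNet, while the stage sizes in the example are made-up illustrations.

# Compound scaling: one coefficient phi scales depth, width and input resolution together.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # values reported for EfficientNet

def compound_scale(base_depth, base_width, base_resolution, phi):
    depth = round(base_depth * ALPHA ** phi)          # number of layers
    width = round(base_width * BETA ** phi)           # number of channels
    resolution = round(base_resolution * GAMMA ** phi)
    return depth, width, resolution

# e.g. scaling a stage with depth 3, width 40 channels and 224x224 input by phi = 3
print(compound_scale(3, 40, 224, phi=3))              # -> (5, 53, 341)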

Learning Efficient Convolutional Networks through Network Slimming

The approach is called network slimming, which takes wide and large networks as input models, but during training insignificant channels are automatically identified and pruned afterwards, yielding thin and compact models with comparable accuracy.
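
A minimal sketch of the slimming recipe: an L1 penalty on the BatchNorm scaling factors is added to the training loss, and channels whose factors end up below a threshold are pruned; the penalty weight and threshold below are illustrative choices.

import torch
import torch.nn as nn

def slimming_l1_penalty(model, lam=1e-4):
    """L1 penalty on BatchNorm scaling factors, added to the task loss during
    training so insignificant channels are driven toward zero."""
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            penalty = penalty + m.weight.abs().sum()
    return lam * penalty

def channels_to_prune(bn, threshold=1e-2):
    """Channels whose scaling factor fell below the threshold are considered
    insignificant and can be pruned afterwards."""
    return (bn.weight.abs() < threshold).nonzero(as_tuple=True)[0]

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
loss = slimming_l1_penalty(model)            # add this to the classification loss
prune_idx = channels_to_prune(model[1])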

Data-Driven Sparse Structure Selection for Deep Neural Networks

A simple and effective framework to learn and prune deep models in an end-to-end manner by adding sparsity regularizations on factors, and solving the optimization problem by a modified stochastic Accelerated Proximal Gradient (APG) method.

AutoSlim: Towards One-Shot Architecture Search for Channel Numbers

A simple and one-shot solution to set channel numbers in a neural network to achieve better accuracy under constrained resources (e.g., FLOPs, latency, memory footprint or model size) is presented.

Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration

Unlike previous methods, FPGM compresses CNN models by pruning filters with redundancy, rather than those with “relatively less” importance, and when applied to two image classification benchmarks, the method validates its usefulness and strengths.
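
A short sketch of the FPGM selection criterion: filters whose total distance to all other filters in the layer is smallest lie nearest the geometric median and are treated as redundant; the training schedule around this step is omitted here.

import torch

def fpgm_prune_indices(conv_weight, num_prune):
    """Flatten each filter, compute its total distance to all other filters in the
    layer, and mark the filters closest to the others (i.e. nearest the geometric
    median) as redundant and prunable."""
    filters = conv_weight.flatten(1)                    # (out_channels, rest)
    dists = torch.cdist(filters, filters)               # pairwise Euclidean distances
    redundancy = dists.sum(dim=1)                        # small sum => near the geometric median
    return redundancy.argsort()[:num_prune]              # indices of most replaceable filters

weight = torch.randn(64, 32, 3, 3)
prune_idx = fpgm_prune_indices(weight, num_prune=16)     # 16 most redundant filters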

Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets

A tiny formula for downsizing neural architectures through a series of smaller models derived from the EfficientNet-B0 with the FLOPs constraint is summarized, observing that resolution and depth are more important than width for tiny networks.

Channel Pruning for Accelerating Very Deep Neural Networks

  • Yihui He, Xiangyu Zhang, Jian Sun
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
This paper proposes an iterative two-step algorithm to effectively prune each layer, by a LASSO regression based channel selection and least square reconstruction, and generalizes this algorithm to multi-layer and multi-branch cases.
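
A hedged NumPy/scikit-learn sketch of those two steps: LASSO over per-channel contributions selects which input channels to keep, and a least-squares refit reconstructs the layer's original responses from the kept channels; shapes and the regularization strength are illustrative assumptions, not the paper's exact setup.

import numpy as np
from sklearn.linear_model import Lasso

def select_and_reconstruct(X, W, alpha=0.01):
    """Step 1: LASSO over per-channel contributions picks informative input channels.
    Step 2: least-squares refit of the remaining weights reconstructs the original
    layer output. X is (samples, channels, k*k) unrolled inputs, W is (out, channels, k*k)."""
    N, C, K = X.shape
    Y = np.einsum('nck,ock->no', X, W)                   # original responses
    # per-channel contributions, flattened over (samples, outputs)
    Z = np.stack([np.einsum('nk,ok->no', X[:, c], W[:, c]).ravel() for c in range(C)], axis=1)
    beta = Lasso(alpha=alpha, fit_intercept=False).fit(Z, Y.ravel()).coef_
    keep = np.nonzero(beta)[0]                           # selected channels
    A = X[:, keep].reshape(N, -1)                        # (N, len(keep)*K)
    W_new, *_ = np.linalg.lstsq(A, Y, rcond=None)        # reconstruction weights
    return keep, W_new.reshape(len(keep), K, -1).transpose(2, 0, 1)   # (out, kept, K)

keep, W_new = select_and_reconstruct(np.random.randn(200, 8, 9), np.random.randn(4, 8, 9))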

Reborn Filters: Pruning Convolutional Neural Networks with Limited Data

This paper proposes to use all original filters to directly develop new compact filters, named reborn filters, so that all useful structure priors in the original filters can be well preserved into the pruned networks, alleviating the performance drop accordingly.

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks

The proposed Soft Filter Pruning (SFP) method enables the pruned filters to be updated when training the model after pruning, which has two advantages over previous works: larger model capacity and less dependence on the pretrained model.
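
A minimal sketch of one SFP step, applied after a training epoch: the lowest-norm filters are zeroed rather than removed, so they continue to receive updates in later epochs and may regrow before the final hard prune; the prune ratio is an illustrative choice.

import torch
import torch.nn as nn

def soft_filter_prune(conv, prune_ratio=0.3):
    """Zero the filters with the smallest L2 norm instead of removing them, so the
    model keeps its full capacity and the zeroed filters can be updated again."""
    with torch.no_grad():
        w = conv.weight                                   # (out, in, kh, kw)
        norms = w.flatten(1).norm(dim=1)
        num_prune = int(prune_ratio * w.size(0))
        idx = norms.argsort()[:num_prune]                 # weakest filters this epoch
        w[idx] = 0.0                                      # zeroed, not removed

conv = nn.Conv2d(3, 16, 3)
soft_filter_prune(conv, prune_ratio=0.25)                 # 4 of 16 filters set to zero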