Butterfly Transform: An Efficient FFT Based Neural Architecture Design

@article{AlizadehVahid2020ButterflyTA,
  title={Butterfly Transform: An Efficient FFT Based Neural Architecture Design},
  author={Keivan Alizadeh-Vahid and Ali Farhadi and Mohammad Rastegari},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={12021-12030}
}
In this paper, we introduce the Butterfly Transform (BFT), a lightweight channel fusion method that reduces the computational complexity of point-wise convolutions from the O(n^2) of conventional solutions to O(n log n) with respect to the number of channels, while improving the accuracy of the networks under the same range of FLOPs. [...] Notably, ShuffleNet-V2+BFT outperforms the state-of-the-art architecture search methods MNasNet \cite{tan2018mnasnet} and FBNet \cite{wu2018fbnet}. We also show that [...]
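As a rough illustration of the channel fusion described above, here is a minimal numpy sketch (the function name, weight layout, and toy input are my own; it is not the authors' released BFT implementation, which operates on full feature maps and includes further refinements). It applies log2(n) butterfly stages to an n-channel vector, each stage mixing channel pairs with a learned 2x2 matrix, for roughly 2*n*log2(n) multiply-adds instead of the n^2 of a dense point-wise convolution.

import numpy as np

def butterfly_transform(x, weights):
    """Butterfly-structured channel fusion of an n-channel vector x.

    weights[s] holds, for stage s, an array of shape (n // 2, 2, 2): one learned
    2x2 mixing matrix per channel pair.  log2(n) stages of n/2 pairs each cost
    about 2*n*log2(n) multiplies, versus n*n for a dense point-wise convolution.
    """
    n = x.shape[0]
    assert n & (n - 1) == 0, "channel count must be a power of two"
    assert len(weights) == int(np.log2(n)), "need log2(n) stages"
    y = x.copy()
    stride = n // 2
    for stage_w in weights:                     # log2(n) butterfly stages
        out = np.empty_like(y)
        pair = 0
        for start in range(0, n, 2 * stride):   # blocks of 2*stride channels
            for i in range(start, start + stride):
                a, b = y[i], y[i + stride]
                w = stage_w[pair]               # learned 2x2 mix for this pair
                out[i] = w[0, 0] * a + w[0, 1] * b
                out[i + stride] = w[1, 0] * a + w[1, 1] * b
                pair += 1
        y = out
        stride //= 2
    return y

if __name__ == "__main__":
    n = 8
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n)
    weights = [rng.standard_normal((n // 2, 2, 2)) for _ in range(int(np.log2(n)))]
    print(butterfly_transform(x, weights))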
Citations

MicroNet: Improving Image Recognition with Extremely Low FLOPs
TLDR
This paper finds that two factors, sparse connectivity and a dynamic activation function, are effective for improving accuracy, and proposes micro-factorized convolution, which factorizes a convolution matrix into low-rank matrices to integrate sparse connectivity into convolution.
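To make the low-rank factorization concrete, here is a toy numpy sketch (names and shapes are my own illustration, not MicroNet's actual micro-factorized blocks): a dense n x n channel-mixing matrix is replaced by two thin factors.

import numpy as np

def lowrank_channel_mix(x, P, Q):
    """Approximate a dense point-wise (1x1) channel mix W @ x with a rank-r
    factorization W ~ P @ Q.  For n channels this costs 2*n*r multiply-adds
    per spatial position instead of n*n, a saving whenever r < n/2.
    x: (n,) channel vector; P: (n, r); Q: (r, n)."""
    return P @ (Q @ x)

if __name__ == "__main__":
    n, r = 64, 8
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n)
    P, Q = rng.standard_normal((n, r)), rng.standard_normal((r, n))
    print(lowrank_channel_mix(x, P, Q).shape)   # (64,)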
Cyclic Sparsely Connected Architectures for Compact Deep Convolutional Neural Networks
TLDR
This article proposes cyclic sparsely connected (CSC) architectures and shows that both standard convolution and depthwise convolution layers are special cases of CSC layers, whose mathematical function can be unified into a single formulation and whose hardware implementation can be carried out with one arithmetic logic component.
Universal Cyclic Sparsely Connected Layers for Compact Convolutional Neural Network Design
  • 2020
Model size and computational complexity of deep convolutional neural networks (DCNNs) are two major factors governing their throughput and energy efficiency when deployed to hardware for inference. [...]
MicroNet: Towards Image Recognition with Extremely Low FLOPs
TLDR
This paper proposes Micro-Factorized convolution to factorize both point-wise and depthwise convolutions into low-rank matrices, and a new activation function, named Dynamic Shift-Max, that improves non-linearity by maxing out multiple dynamic fusions between an input feature map and its circular channel shift.
Substituting Convolutions for Neural Network Compression
TLDR
This paper proposes a simple compression technique that is general, easy to apply, and requires minimal tuning, and that leverages a number of methods developed as efficient alternatives to fully-connected layers as point-wise substitutions, yielding Pareto-optimal efficiency/accuracy benefits.
Rethinking Neural Operations for Diverse Tasks
TLDR
This work introduces a search space of neural operations called XD-Operations that mimic the inductive bias of standard multichannel convolutions while being much more expressive: it is proved that XD-operations include many named operations across several application areas.
Sparsifying Networks via Subdifferential Inclusion
TLDR
This article proposes an iterative optimization algorithm with guaranteed convergence for inducing sparsity, and shows that the proposed approach is valid for a broad class of activation functions (ReLU, sigmoid, softmax).
QuicK-means: accelerating inference for K-means by learning fast transforms
TLDR
An efficient extension of K-means is introduced that rests on the idea of expressing the matrix of cluster centroids as a product of sparse matrices, and it is proposed to learn such a factorization during Lloyd's training procedure.
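A minimal sketch of the resulting speed-up (function and variable names are mine; the actual method learns the sparse factors inside Lloyd's iterations): if the K x d centroid matrix is approximately a product of sparse factors, the centroid-times-point products that dominate the assignment step cost only as much as the factors' nonzeros.

import numpy as np
from scipy import sparse

def centroid_scores(sparse_factors, x):
    """Compute U @ x where U ~ S_1 @ S_2 @ ... @ S_m is a product of sparse
    factors.  Applying the factors right-to-left costs O(total nonzeros)
    instead of the O(K * d) of a dense centroid matrix."""
    y = x
    for S in reversed(sparse_factors):
        y = S @ y
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    factors = [sparse.random(32, 32, density=0.1, random_state=s) for s in range(3)]
    print(centroid_scores(factors, rng.standard_normal(32)).shape)   # (32,)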
Robust Computationally-Efficient Wireless Emitter Classification Using Autoencoders and Convolutional Neural Networks
TLDR
This work proposes an emitter classification solution in which a Denoising Autoencoder (DAE) feeds a CNN classifier with lower-dimensional, denoised representations of channel-corrupted spectrograms; it outperforms a wide range of standalone CNNs and other machine learning models while requiring significantly fewer computational resources.
Mobile-Former: Bridging MobileNet and Transformer
TLDR
The proposed lightweight cross attention that models the bridge in Mobile-Former is not only computationally efficient but also has more representational power, outperforming MobileNetV3 in the low-FLOP regime from 25M to 500M FLOPs on ImageNet classification.
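For intuition, here is a generic cross-attention sketch from a handful of learned tokens to flattened feature-map positions (a simplification with my own names: no projections or multi-head logic, so it is not Mobile-Former's exact bridge). Because the token count M is tiny, the cost is O(M * H*W * d) rather than quadratic in the number of positions.

import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(tokens, features):
    """Attend from a small set of global tokens (M, d) to flattened
    feature-map positions (H*W, d) and return updated tokens (M, d)."""
    scores = tokens @ features.T / np.sqrt(features.shape[-1])   # (M, H*W)
    attn = softmax(scores, axis=-1)                              # attention weights
    return attn @ features                                       # (M, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((6, 32))           # 6 global tokens
    features = rng.standard_normal((14 * 14, 32))   # 14x14 feature map, 32 channels
    print(cross_attention(tokens, features).shape)  # (6, 32)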

References

Showing 1-10 of 74 references
Butterfly-Net: Optimal Function Representation Based on Convolutional Neural Networks
TLDR
Butterfly-Net is a low-complexity CNN with structured and sparse cross-channel connections that aims at an optimal hierarchical function representation of the input signal; the learned network outperforms its hard-coded counterpart and achieves accuracy similar to the trained CNN but with far fewer parameters.
Speeding up Convolutional Neural Networks with Low Rank Expansions
TLDR
Two simple schemes for drastically speeding up convolutional neural networks are presented, achieved by exploiting cross-channel or filter redundancy to construct a low-rank basis of filters that are rank-1 in the spatial domain.
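The spatial part of that redundancy is easy to demonstrate (a sketch only; the paper's full scheme also exploits cross-channel structure and learns the basis): the best rank-1, i.e. spatially separable, approximation of a 2-D filter falls out of its SVD, and convolving with the two 1-D factors costs 2k instead of k^2 operations per output pixel.

import numpy as np

def separable_approximation(kernel):
    """Best rank-1 approximation of a k x k filter: kernel ~ np.outer(col, row).
    Convolving with `col` (vertically) then `row` (horizontally) costs 2k ops
    per pixel instead of k*k for the full 2-D filter."""
    U, s, Vt = np.linalg.svd(kernel)
    col = U[:, 0] * np.sqrt(s[0])
    row = Vt[0, :] * np.sqrt(s[0])
    return col, row

if __name__ == "__main__":
    k = np.outer([1, 2, 1], [-1, 0, 1]).astype(float)  # a separable Sobel-like filter
    col, row = separable_approximation(k)
    print(np.allclose(np.outer(col, row), k))          # True: the filter is exactly rank-1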
LCNN: Lookup-Based Convolutional Neural Network
TLDR
This paper introduces LCNN, a lookup-based convolutional neural network that encodes convolutions with a few lookups into a dictionary trained to cover the space of weights in CNNs, and shows the benefits of LCNN in few-shot learning and few-iteration learning, two crucial aspects of on-device training of deep learning models.
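The lookup idea itself fits in a few lines (illustrative names only, not LCNN's training procedure or its fast lookup kernels): each filter is rebuilt as a sparse combination of a small shared dictionary of channel vectors.

import numpy as np

def lookup_filter(dictionary, indices, coeffs):
    """Reconstruct one filter as a weighted sum of a few dictionary atoms:
    w = sum_k coeffs[k] * dictionary[indices[k]].  Only len(indices) lookups
    and multiplies are needed instead of a dense, independently stored weight."""
    return (coeffs[:, None] * dictionary[indices]).sum(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((100, 64))      # shared dictionary of 100 atoms
    idx = np.array([3, 17, 42])             # this filter uses only 3 atoms
    c = np.array([0.5, -1.2, 0.3])
    print(lookup_filter(D, idx, c).shape)   # (64,)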
CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices
  • Caiwen Ding, Siyu Liao, +13 authors Bo Yuan
  • Computer Science
  • 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)
  • 2017
TLDR
The paper proposes CirCNN, a universal DNN inference engine that can be implemented on various hardware/software platforms with a configurable network architecture (e.g., layer type, size, scales); the FFT can be used as the key computing kernel, ensuring universal and small-footprint implementations.
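The FFT shortcut behind circulant weights can be verified in a few lines (a sketch of a single circulant block with my own names; CirCNN partitions the full weight matrix into such blocks):

import numpy as np

def circulant_matvec_fft(c, x):
    """Multiply the circulant matrix whose first column is c by x in
    O(n log n) via the FFT, instead of O(n^2) for an explicit product."""
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 8
    c, x = rng.standard_normal(n), rng.standard_normal(n)
    # Explicit circulant matrix for comparison: C[i, j] = c[(i - j) % n]
    C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
    print(np.allclose(C @ x, circulant_matvec_fft(c, x)))   # True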
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
We introduce a lightweight, power efficient, and general-purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise [...]
Building Efficient Deep Neural Networks With Unitary Group Convolutions
TLDR
It is experimentally demonstrated that dense unitary transforms can outperform channel shuffling in DNN accuracy, and that the proposed HadaNet, a UGConv network using Hadamard transforms, achieves accuracy similar to circulant networks with lower computational complexity, and better accuracy than ShuffleNets with the same number of parameters and floating-point multiplies.
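The Hadamard transform used for this fusion has the same butterfly-style O(n log n) structure; here is a minimal fast Walsh-Hadamard transform sketch (the transform itself, not the HadaNet layer):

import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform of a length-2^k vector, using log2(n)
    butterfly stages of additions and subtractions only (no multiplies)."""
    y = np.array(x, dtype=float)
    n = len(y)
    assert n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y

if __name__ == "__main__":
    # A unit impulse transforms to a constant vector (a row of the Hadamard matrix).
    print(fwht([1, 0, 0, 0]))   # [1. 1. 1. 1.]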
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
TLDR
This work proposes evaluating the direct metric (e.g., speed) on the target platform, beyond only considering FLOPs, and derives several practical guidelines for efficient network design, which lead to a new architecture called ShuffleNet V2.
Dynamic Channel Pruning: Feature Boosting and Suppression
TLDR
This paper proposes feature boosting and suppression (FBS), a new method to predictively amplify salient convolutional channels and skip unimportant ones at run-time; it compares FBS to a range of existing channel pruning and dynamic execution schemes and demonstrates large improvements on ImageNet classification.
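A rough sketch of run-time channel gating in this spirit (my own simplification: a single linear predictor and hard top-k selection, not the exact FBS formulation):

import numpy as np

def gate_channels(x, W, k):
    """Predict a saliency per channel from globally pooled features, keep the
    top-k channels and zero the rest so their convolutions can be skipped.
    x: feature map of shape (C, H, W); W: learned (C, C) gating weights."""
    pooled = x.mean(axis=(1, 2))               # (C,) global average pooling
    saliency = np.maximum(W @ pooled, 0.0)     # (C,) non-negative importance
    keep = np.argsort(saliency)[-k:]           # indices of the k most salient channels
    gate = np.zeros_like(saliency)
    gate[keep] = saliency[keep]
    return x * gate[:, None, None]             # boosted / suppressed feature map

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, W = rng.standard_normal((16, 8, 8)), rng.standard_normal((16, 16))
    y = gate_channels(x, W, k=4)
    print(y.shape, (y != 0).any(axis=(1, 2)).sum())   # (16, 8, 8), at most 4 active channels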
Learning Structured Sparsity in Deep Neural Networks
TLDR
The results show that, for CIFAR-10, regularization on layer depth can reduce a 20-layer Deep Residual Network to 18 layers while improving the accuracy from 91.25% to 92.60%, which is still slightly higher than that of the original ResNet with 32 layers.
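The regularizer behind this structured sparsity is compact; here is a sketch of one of the group structures the paper considers, whole output filters (function name is mine):

import numpy as np

def filter_group_lasso(W):
    """Group-lasso penalty over whole output filters of a conv weight tensor
    W with shape (out_channels, in_channels, kH, kW): the sum of the L2 norms
    of the filters.  Added to the training loss, it drives entire filters to
    zero so they can be removed after training."""
    return np.sqrt((W ** 2).sum(axis=(1, 2, 3))).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((32, 16, 3, 3))
    print(filter_group_lasso(W))   # a single scalar penalty to add to the loss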
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations
TLDR
This work introduces a parameterization of divide-and-conquer methods that can automatically learn an efficient algorithm for many important transforms, and can be incorporated as a lightweight replacement for generic matrices in machine learning pipelines to learn efficient and compressible transformations.