Trained Rank Pruning for Efficient Deep Neural Networks

@article{Xu2019TrainedRP,
  title={Trained Rank Pruning for Efficient Deep Neural Networks},
  author={Yuhui Xu and Yuxi Li and Shuai Zhang and Wei Wen and Botao Wang and Yingyong Qi and Yiran Chen and Weiyao Lin and Hongkai Xiong},
  journal={2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition (EMC2-NIPS)},
  year={2019},
  pages={14-17}
}
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in parameters can ripple over a large prediction loss. Apparently, it is not optimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and…
TRP: Trained Rank Pruning for Efficient Deep Neural Networks
TLDR
Trained Rank Pruning (TRP) is proposed, which alternates between low-rank approximation and training, maintaining the capacity of the original network while imposing low-rank constraints during training.
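A minimal sketch of this alternation, assuming a plain PyTorch training loop (an illustration of the idea only, not the authors' implementation, which also handles rank selection and low-rank regularization during training; the projection interval and rank below are arbitrary placeholders):

import torch
import torch.nn as nn

def project_low_rank(weight: torch.Tensor, rank: int) -> torch.Tensor:
    # Best rank-`rank` approximation of a 2-D weight matrix via truncated SVD.
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

def project_model(model: nn.Module, rank: int) -> None:
    # Replace every linear layer's weight with its truncated-SVD approximation.
    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, nn.Linear):
                module.weight.copy_(project_low_rank(module.weight, rank))

# Alternate ordinary SGD updates with periodic low-rank projection.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
for step in range(1000):
    x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    optimizer.step()
    if step % 20 == 0:  # projection interval chosen arbitrarily for the sketch
        project_model(model, rank=32)

Because the projection is applied throughout training rather than once at the end, the network keeps adapting to the low-rank constraint instead of absorbing a one-shot approximation error.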
Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification
  • Huanrui Yang, Minxue Tang, +5 authors Yiran Chen
  • Computer Science, Mathematics
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
  • 2020
TLDR
SVD training is proposed, the first method to explicitly achieve low-rank DNNs during training without applying SVD at every step; it is empirically shown that SVD training can significantly reduce the rank of DNN layers and achieve a greater reduction in computation load at the same accuracy.
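A hypothetical sketch of such factored training, assuming PyTorch (names, shapes, and penalty weights are illustrative, not the paper's code): the layer is stored directly as U·diag(s)·Vᵀ, an orthogonality penalty keeps U and V close to orthonormal, and an L1 penalty on s pushes small singular values toward zero so a low rank emerges without running SVD at every step.

import torch
import torch.nn as nn

class SVDLinear(nn.Module):
    # Linear layer stored as U * diag(s) * V^T and trained in factored form.
    def __init__(self, in_features: int, out_features: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(0.02 * torch.randn(out_features, rank))
        self.s = nn.Parameter(torch.ones(rank))
        self.V = nn.Parameter(0.02 * torch.randn(in_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return ((x @ self.V) * self.s) @ self.U.t()

    def orthogonality_penalty(self) -> torch.Tensor:
        # Encourage the columns of U and V to stay orthonormal.
        eye = torch.eye(self.s.numel())
        return ((self.U.t() @ self.U - eye) ** 2).sum() + \
               ((self.V.t() @ self.V - eye) ** 2).sum()

    def sparsity_penalty(self) -> torch.Tensor:
        # L1 penalty that sparsifies the singular values.
        return self.s.abs().sum()

# Training combines the task loss with both penalties (coefficients are placeholders);
# afterwards, columns whose singular values are near zero can be dropped.
layer = SVDLinear(256, 128, rank=64)
loss = layer(torch.randn(8, 256)).pow(2).mean() \
       + 1e-2 * layer.orthogonality_penalty() + 1e-3 * layer.sparsity_penalty()
loss.backward()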
Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer
TLDR
It is shown that this indeed can select ranks much better than existing approaches, making low-rank compression much more attractive than previously thought, and can make a VGG network faster than a ResNet with nearly the same classification error.
Compression-aware Continual Learning using Singular Value Decomposition
TLDR
This work employs compression-aware training and performs low-rank weight approximation using singular value decomposition (SVD) to achieve network compaction, and introduces a novel learning scheme between tasks based on a shared representational space.
Pruning by Training: A Novel Deep Neural Network Compression Framework for Image Processing
TLDR
This work proposes an effective one-stage pruning framework that introduces a trainable collaborative layer to jointly prune and learn neural networks in one go, and demonstrates very promising results against other state-of-the-art filter pruning methods.
Scalable Deep Neural Networks via Low-Rank Matrix Factorization
TLDR
A novel method is proposed that enables DNNs to flexibly change their size after training via singular value decomposition (SVD), making it possible to compress models effectively while increasing the error as little as possible.
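The kind of post-training flexibility described above typically rests on a standard construction: factor a trained weight matrix with a truncated SVD into two thinner layers whose rank (and hence size) can be chosen after training. The sketch below is a generic PyTorch version of that construction, not the paper's implementation.

import torch
import torch.nn as nn

def factor_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    # Replace an (out x in) linear layer with two layers: in -> rank -> out.
    U, S, Vh = torch.linalg.svd(layer.weight.detach(), full_matrices=False)
    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    with torch.no_grad():
        first.weight.copy_(torch.diag(S[:rank].sqrt()) @ Vh[:rank, :])
        second.weight.copy_(U[:, :rank] @ torch.diag(S[:rank].sqrt()))
        if layer.bias is not None:
            second.bias.copy_(layer.bias)
    return nn.Sequential(first, second)

# The rank is chosen after training, trading accuracy for parameter count:
layer = nn.Linear(512, 512)                 # ~262k weights
small = factor_linear(layer, rank=64)       # ~65k weights across the two factors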
Neural Network Compression via Additive Combination of Reshaped, Low-Rank Matrices
TLDR
This work considers a form of network compression that has not been explored before, an additive combination of reshaped low-rank matrices, which results in a "Learning-Compression" algorithm that alternates between a standard machine learning step and a signal-compression step.
Optimal Selection of Matrix Shape and Decomposition Scheme for Neural Network Compression
TLDR
The algorithm automatically selects the most suitable ranks and decomposition schemes to efficiently reduce the computational cost (e.g., FLOPs) of various networks.
Principal Component Networks: Parameter Reduction Early in Training
TLDR
This paper shows how to find small networks that exhibit the same performance as their overparameterized counterparts after only a few training epochs; it uses PCA to find a high-variance basis for layer inputs and represents layer weights using these directions.
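A generic sketch of that idea, assuming PyTorch (helper names and shapes are illustrative, not the authors' implementation): compute a high-variance basis for a layer's inputs with PCA, then represent the layer's weights in that reduced basis, giving a low-dimensional bottleneck early in training.

import torch

def pca_input_basis(inputs: torch.Tensor, k: int) -> torch.Tensor:
    # Top-k principal directions of an (n_samples x n_features) input matrix.
    centered = inputs - inputs.mean(dim=0, keepdim=True)
    _, _, Vh = torch.linalg.svd(centered, full_matrices=False)
    return Vh[:k]                                    # (k, n_features)

def project_weights(weight: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    # Express an (out x n_features) weight matrix in the k-dimensional input basis.
    return weight @ basis.t()                        # (out, k)

# y = W x is then approximated by (W B^T)(B x), a k-dimensional bottleneck.
inputs = torch.randn(1000, 256)
basis = pca_input_basis(inputs, k=32)
w_small = project_weights(torch.randn(128, 256), basis)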
Refining the Structure of Neural Networks Using Matrix Conditioning
TLDR
This work proposes a practical method that employs matrix conditioning to automatically design the structure of the layers of a feed-forward network, by first adjusting the proportion of neurons among the layers and then scaling the size of the network up or down.

References

Showing 1-10 of 40 references
Coordinating Filters for Faster Deep Neural Networks
TLDR
Force Regularization is proposed, which applies attractive forces to filters so as to coordinate more weight information into a lower-rank space; it is mathematically and empirically verified that after applying this technique, standard LRA methods can reconstruct filters using a much smaller basis and thus yield faster DNNs.
Constrained Optimization Based Low-Rank Approximation of Deep Neural Networks
TLDR
COBLA is proposed; it is empirically demonstrated that COBLA outperforms prior art on the SqueezeNet and VGG-16 architectures on the ImageNet dataset, with the constrained optimization approximately solved by sequential quadratic programming.
Learning Structured Sparsity in Deep Neural Networks
TLDR
The results show that for CIFAR-10, regularization on layer depth can reduce a Deep Residual Network from 20 layers to 18 layers while improving the accuracy from 91.25% to 92.60%, which is still slightly higher than that of the original ResNet with 32 layers.
Convolutional neural networks with low-rank regularization
TLDR
A new algorithm is presented for computing a low-rank tensor decomposition that removes the redundancy in convolution kernels, and it is more effective than iterative methods at speeding up large CNNs.
Training Quantized Nets: A Deeper Understanding
TLDR
This work investigates training methods for quantized neural networks from a theoretical viewpoint, explores accuracy guarantees for training methods under convexity assumptions, and shows that training algorithms which exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training using low-precision arithmetic.
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression
TLDR
ThiNet is proposed, an efficient and unified framework to simultaneously accelerate and compress CNN models in both the training and inference stages; it is revealed that filters need to be pruned based on statistics computed from the next layer, not the current layer, which differentiates ThiNet from existing methods.
Wide Residual Networks
TLDR
This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width is increased; the resulting network structures are called wide residual networks (WRNs), which are far superior to their commonly used thin and very deep counterparts.
Pruning Filters for Efficient ConvNets
TLDR
This work presents an acceleration method for CNNs, showing that even simple filter pruning techniques can reduce inference costs for VGG-16 and ResNet-110 by up to 38% on CIFAR-10 while regaining close to the original accuracy by retraining the networks.
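The criterion behind this kind of filter pruning ranks each filter by the L1 norm of its weights and removes the smallest ones before retraining. A minimal sketch of the ranking step in PyTorch (the subsequent layer surgery and retraining are omitted; the pruning ratio is a placeholder):

import torch
import torch.nn as nn

def filters_to_prune(conv: nn.Conv2d, prune_ratio: float) -> torch.Tensor:
    # Indices of the output filters with the smallest L1 norm (a simple saliency).
    l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))   # one value per filter
    num_prune = int(prune_ratio * conv.out_channels)
    return torch.argsort(l1)[:num_prune]

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(filters_to_prune(conv, prune_ratio=0.3))           # 38 filter indices

Removing those filters also removes the corresponding input channels of the following layer, which is where the actual inference savings come from.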
Compression-aware Training of Deep Networks
TLDR
It is shown that accounting for compression during training allows us to learn models that are much more compact than, yet at least as effective as, those obtained with state-of-the-art compression techniques.
Speeding up Convolutional Neural Networks with Low Rank Expansions
TLDR
Two simple schemes for drastically speeding up convolutional neural networks are presented, achieved by exploiting cross-channel or filter redundancy to construct a low-rank basis of filters that are rank-1 in the spatial domain.
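Filters that are rank-1 in the spatial domain amount to separating a k x k convolution into a k x 1 pass followed by a 1 x k pass through a small number of intermediate channels. The PyTorch sketch below shows that structure generically; the paper's contribution is the procedure for fitting such factors to a trained network, which is not reproduced here.

import torch.nn as nn

def separable_conv(in_ch: int, out_ch: int, k: int, rank: int) -> nn.Sequential:
    # Approximate a k x k convolution with spatially rank-1 (separable) filters.
    return nn.Sequential(
        # Vertical k x 1 filters mapping in_ch -> rank intermediate channels.
        nn.Conv2d(in_ch, rank, kernel_size=(k, 1), padding=(k // 2, 0), bias=False),
        # Horizontal 1 x k filters mapping rank -> out_ch output channels.
        nn.Conv2d(rank, out_ch, kernel_size=(1, k), padding=(0, k // 2), bias=True),
    )

# Per-pixel multiply count drops from in_ch*out_ch*k*k to (in_ch + out_ch)*rank*k.
block = separable_conv(in_ch=64, out_ch=64, k=3, rank=16)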