• Corpus ID: 244920641

Low-rank Tensor Decomposition for Compression of Convolutional Neural Networks Using Funnel Regularization

  • Bo-Shiuan Chu, Che-Rung Lee
Tensor decomposition is one of the fundamental techniques for model compression of deep convolutional neural networks, owing to its ability to reveal the latent relations among complex structures. However, most existing methods compress the networks layer by layer, which cannot provide a satisfactory solution for achieving global optimization. In this paper, we propose a model reduction method to compress the pre-trained networks using low-rank tensor decomposition of the convolution layers. Our… 
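The layer-wise low-rank idea can be sketched with a plain truncated SVD of one matricized convolution kernel. This is a minimal illustration only, not the paper's funnel-regularized method; the kernel shape, target rank `r`, and all variable names are assumptions.

```python
import numpy as np

# Illustrative stand-in for a pre-trained 4D conv kernel
# with shape (out_channels, in_channels, kh, kw).
rng = np.random.default_rng(0)
K = rng.standard_normal((64, 32, 3, 3))

# Matricize along the output-channel mode and truncate with an SVD.
M = K.reshape(64, -1)                       # 64 x (32*3*3)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = 16                                      # assumed target rank
M_low = (U[:, :r] * s[:r]) @ Vt[:r]         # best rank-r approximation
K_low = M_low.reshape(K.shape)

# Parameter counts: two thin factors replace the dense kernel.
orig_params = 64 * 32 * 3 * 3               # 18432
comp_params = r * (64 + 32 * 3 * 3)         # 5632
rel_err = np.linalg.norm(K - K_low) / np.linalg.norm(K)
```

The rank `r` trades compression against reconstruction error; per-layer choices like this are exactly the local decisions the paper's global approach aims to improve on.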


Nested compression of convolutional neural networks with Tucker-2 decomposition

  • R. Zdunek, M. Gábor
  • Computer Science
    2022 International Joint Conference on Neural Networks (IJCNN)
  • 2022
This study presents a novel concept for compressing neural networks using nested low-rank decomposition methods and shows that nested compression can achieve much higher parameter and FLOPS compression with only a minor drop in classification accuracy.

Towards Green AI with tensor networks - Sustainability and innovation enabled by efficient algorithms

This paper presents tensor networks (TNs), an established tool from multilinear algebra, as a promising instrument for sustainable and thus Green AI: TNs can improve efficiency without compromising accuracy, and the paper argues that better algorithms should be evaluated in terms of both accuracy and efficiency.

Stable Low-rank Tensor Decomposition for Compression of Convolutional Neural Network

This paper presents a novel method, which can stabilize the low-rank approximation of convolutional kernels and ensure efficient compression while preserving the high-quality performance of the neural networks.

Convolutional neural networks with low-rank regularization

A new algorithm for computing a low-rank tensor decomposition that removes redundancy in convolution kernels is presented; it is more effective than iterative methods for speeding up large CNNs.

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

A simple two-step approach for speeding up convolution layers within large convolutional neural networks, based on tensor decomposition and discriminative fine-tuning, is proposed; it achieves high CPU speedups at the cost of only small accuracy drops for the smaller of the two networks.

CP-decomposition with Tensor Power Method for Convolutional Neural Networks compression

  • M. Astrid, Seung-Ik Lee
  • Computer Science
    2017 IEEE International Conference on Big Data and Smart Computing (BigComp)
  • 2017
A CNN compression method based on CP-decomposition and the Tensor Power Method is proposed, together with an iterative fine-tuning scheme in which the whole network is fine-tuned after decomposing each layer but before decomposing the next.
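The tensor power method referenced above extracts rank-1 CP terms one at a time by alternating power iterations. Below is a hedged toy sketch on a small random 3-way tensor with greedy deflation; the tensor size, rank `R`, and iteration counts are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((8, 8, 8))           # toy 3-way tensor

def rank1_power(X, iters=200):
    # Alternating power updates for a leading rank-1 term a ⊗ b ⊗ c.
    a = rng.standard_normal(X.shape[0]); a /= np.linalg.norm(a)
    b = rng.standard_normal(X.shape[1]); b /= np.linalg.norm(b)
    c = rng.standard_normal(X.shape[2]); c /= np.linalg.norm(c)
    for _ in range(iters):
        a = np.einsum('ijk,j,k->i', X, b, c); a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', X, a, c); b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', X, a, b); c /= np.linalg.norm(c)
    lam = np.einsum('ijk,i,j,k->', X, a, b, c)   # term weight
    return lam, a, b, c

# Greedy CP: extract R rank-1 terms by deflating the residual.
R, resid, terms = 4, T.copy(), []
for _ in range(R):
    lam, a, b, c = rank1_power(resid)
    terms.append((lam, a, b, c))
    resid = resid - lam * np.einsum('i,j,k->ijk', a, b, c)
```

Each extracted term strictly reduces the residual norm, which mirrors how the layer-by-layer decompose-then-fine-tune loop described above controls accumulated error.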

Compressing Deep Convolutional Networks using Vector Quantization

This paper achieves 16–24× compression of the network with only a 1% loss of classification accuracy using a state-of-the-art CNN, and finds that, for compressing the most storage-demanding densely connected layers, vector quantization methods have a clear gain over existing matrix factorization methods.
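The vector-quantization idea can be illustrated in its simplest scalar form: cluster all weights with k-means and store only a small codebook plus per-weight indices. This is a rough sketch with assumed sizes and a simple quantile initialization, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 128))          # toy dense-layer weights

# Scalar k-means quantization: k centroids shared by all weights.
k, iters = 16, 20
flat = W.ravel()
centroids = np.quantile(flat, np.linspace(0, 1, k))   # spread-out init
for _ in range(iters):
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    for j in range(k):
        if np.any(idx == j):
            centroids[j] = flat[idx == j].mean()
W_q = centroids[idx].reshape(W.shape)

# Storage: 4-bit indices + a 16-entry codebook instead of 32-bit floats,
# roughly an 8x reduction for this layer.
err = np.linalg.norm(W - W_q) / np.linalg.norm(W)
```

Grouping weights into small vectors before clustering (true vector quantization) pushes the compression ratio further, which is where the 16–24× figures above come from.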

ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

ThiNet, an efficient and unified framework for simultaneously accelerating and compressing CNN models in both the training and inference stages, is proposed; it reveals that filters need to be pruned based on statistics computed from the next layer, not the current layer, which differentiates ThiNet from existing methods.

A Survey of Model Compression and Acceleration for Deep Neural Networks

This paper surveys recently developed techniques for compacting and accelerating CNN models, roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation.

Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications

A simple and effective scheme to compress an entire CNN, called one-shot whole-network compression, is proposed; it also addresses an important implementation-level issue with 1×1 convolution, a key operation in the inception module of GoogLeNet as well as in CNNs compressed by the proposed scheme.

Automated Multi-Stage Compression of Neural Networks

This work proposes a new, simple, and efficient iterative approach that alternates low-rank factorization with smart rank selection and fine-tuning, improving the compression rate while maintaining accuracy across a variety of tasks.

Channel Pruning for Accelerating Very Deep Neural Networks

  • Yihui He, Xiangyu Zhang, Jian Sun
  • Computer Science
    2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
This paper proposes an iterative two-step algorithm to effectively prune each layer via LASSO-regression-based channel selection and least-squares reconstruction, and generalizes the algorithm to multi-layer and multi-branch cases.
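The two-step loop above can be sketched on toy data: a LASSO selects a sparse subset of input channels, then least squares refits the kept channels to reconstruct the layer output. The data sizes, regularization strength, and coordinate-descent solver here are assumptions for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, C = 200, 10                               # samples, input channels (toy)
X = rng.standard_normal((n, C))              # per-channel output contributions
y = X[:, :4] @ rng.standard_normal(4)        # only 4 channels actually matter

def lasso_cd(X, y, lam, iters=100):
    # Plain coordinate descent for the LASSO channel-selection step.
    beta = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(iters):
        for j in range(X.shape[1]):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = X[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return beta

# Step 1: sparse channel selection.
beta = lasso_cd(X, y, lam=5.0)
kept = np.flatnonzero(np.abs(beta) > 1e-6)   # retained channel indices

# Step 2: least-squares reconstruction using only the kept channels.
w, *_ = np.linalg.lstsq(X[:, kept], y, rcond=None)
```

Raising `lam` prunes more channels; the refit in step 2 undoes the LASSO's shrinkage on the survivors, which is why the two steps are applied in sequence.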