Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression

Shaowu Chen, Jihao Zhou, Weize Sun and Lei Huang

Hybrid Model Compression for Multi-Task Network

A hybrid joint-network optimization model is proposed for this multi-task, multi-model compression problem: the identically structured sub-networks of multiple models are merged into a single hybrid network in which the corresponding parameters are shared across all tasks.
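The parameter-sharing idea can be illustrated with a minimal sketch (names and shapes are ours, not the paper's optimization model): two task networks reuse one backbone weight matrix and keep only their task-specific heads separate.

```python
import numpy as np

rng = np.random.default_rng(0)
shared_backbone = rng.standard_normal((128, 64))   # one copy, shared by all tasks
head_a = rng.standard_normal((64, 10))             # task-A specific head
head_b = rng.standard_normal((64, 5))              # task-B specific head

def forward(x, head):
    hidden = np.maximum(x @ shared_backbone, 0.0)  # shared ReLU layer
    return hidden @ head

x = rng.standard_normal((4, 128))
out_a, out_b = forward(x, head_a), forward(x, head_b)

# sharing stores the backbone once instead of once per task
params_shared = shared_backbone.size + head_a.size + head_b.size
params_separate = 2 * shared_backbone.size + head_a.size + head_b.size
```

The compression comes purely from storing the common sub-network once; each task still gets its own output mapping.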

A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration

To improve the efficiency of deep learning research, this review focuses on three aspects: quantized/binarized models, optimized architectures, and resource-constrained systems.

Sub-network Multi-objective Evolutionary Algorithm for Filter Pruning

A multi-objective optimization problem is formulated over a sub-network of the full model, and a Sub-network Multi-objective Evolutionary Algorithm (SMOEA) for filter pruning is proposed, which obtains a lightweight pruned model with better performance.
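The multi-objective selection step can be sketched as follows (a toy stand-in, not SMOEA's actual encoding or search): candidate filter masks are scored on two objectives, kept-filter count and a reconstruction-error proxy, and only the non-dominated (Pareto-optimal) candidates are retained.

```python
import numpy as np

rng = np.random.default_rng(3)
filters = rng.standard_normal((16, 27))  # 16 flattened 3x3x3 filters of one layer

def objectives(mask):
    """Two objectives to minimize: number of filters kept, reconstruction error."""
    pruned = filters * mask[:, None]
    return int(mask.sum()), float(np.linalg.norm(filters - pruned))

candidates = [rng.integers(0, 2, size=16) for _ in range(32)]
scores = [objectives(m) for m in candidates]

def pareto_front(scores):
    """Indices of candidates not strictly dominated in both objectives."""
    front = []
    for i, (ki, ei) in enumerate(scores):
        dominated = any(
            kj <= ki and ej <= ei and (kj < ki or ej < ei)
            for j, (kj, ej) in enumerate(scores) if j != i
        )
        if not dominated:
            front.append(i)
    return front

front = pareto_front(scores)
```

An evolutionary loop would then mutate and recombine the front members; here only the dominance filtering is shown.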

Deep Convolutional Neural Network Compression via Coupled Tensor Decomposition

A simultaneous (coupled) tensor decomposition technique for network compression that achieves better network performance under the same compression ratio, with fine-tuning used to restore the performance of the compressed model.
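The core idea of such joint decompositions can be sketched in a few lines (the function name and the stacked-SVD heuristic are our illustration, not the paper's algorithm): two weight matrices are factorized so that they share one left factor.

```python
import numpy as np

def joint_low_rank(w1, w2, rank):
    """Jointly factorize two same-shaped weight matrices so they share
    one left factor U: w1 ~ U @ v1, w2 ~ U @ v2.
    Minimal sketch: U is taken from a truncated SVD of the
    column-concatenated matrix [w1 | w2]."""
    stacked = np.hstack([w1, w2])
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    u_r = u[:, :rank]          # shared factor, stored once
    v1 = u_r.T @ w1            # layer/task-specific factors
    v2 = u_r.T @ w2
    return u_r, v1, v2

rng = np.random.default_rng(0)
# two correlated weight matrices with a common rank-8 column space
base = rng.standard_normal((64, 8))
w1 = base @ rng.standard_normal((8, 32))
w2 = base @ rng.standard_normal((8, 32))

u, v1, v2 = joint_low_rank(w1, w2, rank=8)
err1 = np.linalg.norm(w1 - u @ v1) / np.linalg.norm(w1)
err2 = np.linalg.norm(w2 - u @ v2) / np.linalg.norm(w2)
```

Because both matrices share a column space here, the joint rank-8 factorization reconstructs them essentially exactly while storing the shared factor only once.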

Convolutional neural networks with low-rank regularization

A new algorithm for computing the low-rank tensor decomposition that removes redundancy in the convolution kernels, and is more effective than iterative methods at speeding up large CNNs.
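A common way to exploit such kernel redundancy (a generic truncated-SVD sketch, not the paper's specific algorithm) is to flatten the 4-D kernel, factorize it, and read the factors back as two smaller convolutions:

```python
import numpy as np

rng = np.random.default_rng(1)
cout, cin, kh, kw = 64, 32, 3, 3
kernel = rng.standard_normal((cout, cin, kh, kw))

rank = 16
mat = kernel.reshape(cout, cin * kh * kw)            # flatten to 2-D
u, s, vt = np.linalg.svd(mat, full_matrices=False)
# factor: a rank-filter conv over (cin, kh, kw), then a 1x1 conv
first = vt[:rank].reshape(rank, cin, kh, kw)
second = (u[:, :rank] * s[:rank]).reshape(cout, rank, 1, 1)

orig_params = kernel.size
low_rank_params = first.size + second.size
approx = second.reshape(cout, rank) @ first.reshape(rank, -1)
rel_err = np.linalg.norm(mat - approx) / np.linalg.norm(mat)
```

The two factored layers together hold far fewer parameters than the original kernel, at the price of a rank-truncation error that fine-tuning can then reduce.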

Hybrid Tensor Decomposition in Neural Network Compression

Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications

A simple and effective scheme to compress an entire CNN, called one-shot whole-network compression, which also addresses an important implementation-level issue with 1×1 convolution, a key operation both in the Inception modules of GoogLeNet and in CNNs compressed by the proposed scheme.
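The compression in this line of work is Tucker-2 style: the kernel's input- and output-channel modes are compressed so the layer becomes a 1×1 conv, a small core conv, and another 1×1 conv. A minimal HOSVD-flavored sketch (the ranks here are arbitrary; the paper selects them automatically):

```python
import numpy as np

rng = np.random.default_rng(4)
cout, cin, kh, kw = 64, 32, 3, 3
kernel = rng.standard_normal((cout, cin, kh, kw))
r_out, r_in = 16, 8                      # assumed Tucker-2 ranks

# leading left singular vectors of the two channel-mode unfoldings
u_out = np.linalg.svd(kernel.reshape(cout, -1),
                      full_matrices=False)[0][:, :r_out]
u_in = np.linalg.svd(np.moveaxis(kernel, 1, 0).reshape(cin, -1),
                     full_matrices=False)[0][:, :r_in]
# core[r, i, h, w] = sum_{o, c} kernel[o, c, h, w] * u_out[o, r] * u_in[c, i]
core = np.einsum('ochw,or,ci->rihw', kernel, u_out, u_in)

# resulting layers: 1x1 (cin->r_in), core conv (r_in->r_out), 1x1 (r_out->cout)
tucker2_params = u_in.size + core.size + u_out.size
full_params = kernel.size
```

Both 1×1 factors are ordinary pointwise convolutions, which is why their efficient implementation matters for the scheme.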

Sparse low rank factorization for deep neural network compression

Wide Compression: Tensor Ring Nets

This work introduces Tensor Ring Networks (TR-Nets), which significantly compress both the fully connected layers and the convolutional layers of deep neural networks, and shows promise in scientific computing and deep learning, especially for emerging resource-constrained devices such as smartphones, wearables and IoT devices.
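The tensor-ring format stores a d-way tensor as a cyclic chain of 3-way cores. A minimal sketch of rebuilding the full tensor from its cores (toy shapes, not a trained network):

```python
import numpy as np

def tr_reconstruct(cores):
    """Contract a list of tensor-ring cores G_k of shape (r_k, n_k, r_{k+1}),
    with the last bond wrapping around to the first, into a flat tensor;
    the closing of the ring is a trace over the boundary bond index."""
    full = cores[0]
    for core in cores[1:]:
        # merge modes: (r0, N, r) x (r, n, r') -> (r0, N*n, r')
        full = np.einsum('aib,bjc->aijc', full, core)
        full = full.reshape(full.shape[0], -1, full.shape[-1])
    return np.einsum('aia->i', full)     # trace closes the ring

rng = np.random.default_rng(2)
# two cores: bond dims 2 -> 3 -> 2 (back to the start), modes 4 and 5
cores = [rng.standard_normal((2, 4, 3)), rng.standard_normal((3, 5, 2))]
t = tr_reconstruct(cores).reshape(4, 5)
```

For realistic layer shapes the cores hold far fewer entries than the full weight tensor, which is where the compression comes from.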

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

A simple two-step approach for speeding up convolution layers within large convolutional neural networks is proposed, based on tensor decomposition and discriminative fine-tuning; it achieves higher CPU speedups with smaller accuracy drops for the smaller of the two evaluated networks.
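The first of the two steps, the CP decomposition itself, can be sketched with a plain alternating-least-squares solver (an illustrative fit, not the paper's non-linear least-squares one):

```python
import numpy as np

def unfold(t, mode):
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def kr(x, y):
    """Column-wise Khatri-Rao product."""
    return np.einsum('ir,jr->ijr', x, y).reshape(-1, x.shape[1])

def cp_als(t, rank, iters=100, seed=0):
    """Rank-`rank` CP decomposition of a 3-way tensor by plain ALS:
    t[i,j,k] ~ sum_r a[i,r] * b[j,r] * c[k,r]."""
    rng = np.random.default_rng(seed)
    a, b, c = (rng.standard_normal((n, rank)) for n in t.shape)
    for _ in range(iters):
        a = unfold(t, 0) @ kr(b, c) @ np.linalg.pinv((b.T @ b) * (c.T @ c))
        b = unfold(t, 1) @ kr(a, c) @ np.linalg.pinv((a.T @ a) * (c.T @ c))
        c = unfold(t, 2) @ kr(a, b) @ np.linalg.pinv((a.T @ a) * (b.T @ b))
    return a, b, c

# fit an exactly rank-2 toy tensor
rng = np.random.default_rng(5)
fa, fb, fc = (rng.standard_normal((n, 2)) for n in (4, 5, 6))
tensor = np.einsum('ir,jr,kr->ijk', fa, fb, fc)
a, b, c = cp_als(tensor, rank=2)
approx = np.einsum('ir,jr,kr->ijk', a, b, c)
rel_err = np.linalg.norm(tensor - approx) / np.linalg.norm(tensor)
```

The second step, discriminative fine-tuning, then trains the factors end-to-end on the task loss to recover any accuracy lost in the truncation.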

Progressive principle component analysis for compressing deep convolutional neural networks

Network Decoupling: From Regular to Depthwise Separable Convolutions

Network decoupling (ND) is proposed, a training-free method to accelerate convolutional neural networks (CNNs) by converting pre-trained CNN models into a MobileNet-like depthwise separable convolution structure, achieving a promising speedup with negligible accuracy loss.
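The training-free conversion can be sketched per input channel (a generic truncated-SVD construction consistent with the ND idea; variable names are ours): each cout × (kh·kw) kernel slice is split into `rank` pairs of a depthwise filter and a pointwise vector.

```python
import numpy as np

def decouple(kernel, rank):
    """Approximate each input-channel slice of a regular conv kernel by
    `rank` (depthwise kh*kw filter, pointwise cout vector) pairs via SVD."""
    cout, cin, kh, kw = kernel.shape
    depthwise = np.zeros((rank, cin, kh, kw))
    pointwise = np.zeros((rank, cout, cin))
    for c in range(cin):
        u, s, vt = np.linalg.svd(kernel[:, c].reshape(cout, kh * kw),
                                 full_matrices=False)
        pointwise[:, :, c] = (u[:, :rank] * s[:rank]).T
        depthwise[:, c] = vt[:rank].reshape(rank, kh, kw)
    return depthwise, pointwise

def recouple(depthwise, pointwise):
    # rebuild the kernel: K[o,c,h,w] = sum_k P[k,o,c] * D[k,c,h,w]
    return np.einsum('koc,kchw->ochw', pointwise, depthwise)

rng = np.random.default_rng(7)
kernel = rng.standard_normal((16, 4, 3, 3))
dw, pw = decouple(kernel, rank=9)    # rank = kh*kw: lossless rebuild
rebuilt = recouple(dw, pw)
```

With `rank` below kh·kw the rebuild becomes approximate, trading a small error for a stack of cheap depthwise-plus-pointwise layers.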

On Compressing Deep Models by Low Rank and Sparse Decomposition

A unified framework integrating the low-rank and sparse decomposition of weight matrices with the feature map reconstructions is proposed, which can significantly reduce the parameters for both convolutional and fully-connected layers.
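The decomposition at the heart of such frameworks can be sketched with a GoDec-style greedy alternation (an illustrative solver, not the paper's feature-map-aware one): a truncated SVD updates the low-rank part while magnitude pruning updates the sparse part.

```python
import numpy as np

def low_rank_plus_sparse(w, rank, keep_frac, iters=20):
    """Greedy low-rank + sparse split: alternate a truncated SVD for the
    low-rank part L and top-magnitude selection for the sparse part S,
    so that w ~ L + S."""
    s = np.zeros_like(w)
    for _ in range(iters):
        u, sv, vt = np.linalg.svd(w - s, full_matrices=False)
        l = (u[:, :rank] * sv[:rank]) @ vt[:rank]
        resid = w - l
        thresh = np.quantile(np.abs(resid), 1.0 - keep_frac)
        s = np.where(np.abs(resid) >= thresh, resid, 0.0)
    return l, s

rng = np.random.default_rng(6)
# ground truth: rank-8 matrix plus 20 large sparse outliers
low = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 32))
sparse = np.zeros((64, 32))
idx = rng.choice(64 * 32, size=20, replace=False)
sparse.flat[idx] = 10.0 * rng.standard_normal(20)
w = low + sparse

l, s = low_rank_plus_sparse(w, rank=8, keep_frac=0.02)
rel_err = np.linalg.norm(w - (l + s)) / np.linalg.norm(w)
```

Storing the factors of L plus the few nonzeros of S needs far less memory than the dense matrix, which is what enables the reported parameter reductions.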