Corpus ID: 4402066

Iterative Low-Rank Approximation for CNN Compression

Author: Maksym Kholiavchenko
Deep convolutional neural networks contain tens of millions of parameters, which makes them impractical to run efficiently on embedded devices. We propose an iterative approach that applies low-rank approximation to compress deep convolutional neural networks. Since classification and object detection are the most favored tasks for embedded devices, we demonstrate the effectiveness of our approach by compressing the AlexNet, VGG-16, YOLOv2 and Tiny YOLO networks. Our results show the superiority of the…
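To make the compression claim concrete, here is a minimal sketch, not taken from the paper, of why a low-rank factorization shrinks a layer. An m×n weight matrix replaced by two factors of shapes m×r and r×n stores r(m+n) values instead of mn. The function names and the rank of 256 are hypothetical and chosen only for illustration. The 4096×4096 shape is the size of a fully connected layer of the kind found in AlexNet.

```python
def dense_params(m: int, n: int) -> int:
    """Number of weights in a dense m x n matrix."""
    return m * n


def low_rank_params(m: int, n: int, rank: int) -> int:
    """Number of weights after factoring into (m x rank) and (rank x n)."""
    if not 1 <= rank <= min(m, n):
        raise ValueError("rank must be between 1 and min(m, n)")
    return rank * (m + n)


def compression_ratio(m: int, n: int, rank: int) -> float:
    """How many times fewer weights the factored layer stores."""
    return dense_params(m, n) / low_rank_params(m, n, rank)


def break_even_rank(m: int, n: int) -> int:
    """Largest rank for which the factored form stores strictly fewer weights.

    Solves rank * (m + n) < m * n for the largest integer rank.
    """
    return (m * n - 1) // (m + n)


if __name__ == "__main__":
    m, n, rank = 4096, 4096, 256  # rank is an illustrative assumption
    print(dense_params(m, n))             # weights before factorization
    print(low_rank_params(m, n, rank))    # weights after factorization
    print(compression_ratio(m, n, rank))  # size reduction factor
    print(break_even_rank(m, n))          # ranks above this do not compress
```

The sketch only counts parameters. It says nothing about the accuracy lost at a given rank, which is the trade-off an iterative scheme would have to manage.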
Knowledge Extraction with No Observable Data
This work proposes KegNet (Knowledge Extraction with Generative Networks), a novel approach that extracts the knowledge of a trained deep neural network and generates artificial data points to replace the missing training data in knowledge distillation.
On the Redundancy in the Rank of Neural Network Parameters and Its Controllability
A novel regularization method combines an objective function that makes the parameter rank-deficient with a dynamic low-rank factorization algorithm that gradually reduces the size of this parameter by fusing linearly dependent vectors together, leading to a neural network with better training dynamics and fewer trainable parameters.
Quantum Higher Order Singular Value Decomposition
This paper presents a quantum algorithm for higher order singular value decomposition that allows a quantum computer to decompose a tensor into a core tensor containing tensor singular values and some unitary matrices.


Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
A simple and effective scheme for compressing an entire CNN, called one-shot whole network compression, is proposed; it also addresses an important implementation-level issue with 1×1 convolution, a key operation in the inception module of GoogLeNet as well as in CNNs compressed by the proposed scheme.
Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation
Using large state-of-the-art models, this work demonstrates speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition…
Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding
This work introduces "deep compression", a three-stage pipeline of pruning, trained quantization and Huffman coding, whose stages work together to reduce the storage requirement of neural networks by 35× to 49× without affecting their accuracy.
Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition
A simple two-step approach for speeding up convolution layers within large convolutional neural networks, based on tensor decomposition and discriminative fine-tuning, is proposed; it obtains higher CPU speedups at the cost of lower accuracy drops for the smaller of the two networks.
Very Deep Convolutional Networks for Large-Scale Image Recognition
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Convolutional Neural Networks using Logarithmic Data Representation
This paper proposes a new data representation that enables state-of-the-art networks to be encoded to 3 bits with negligible loss in classification performance, together with an end-to-end training procedure that uses log representation at 5 bits, which achieves higher final test accuracy than linear at 5 bits.
Fixed Point Quantization of Deep Convolutional Networks
This paper proposes a quantizer design for fixed point implementation of DCNs, formulates and solves an optimization problem to identify the optimal fixed point bit-width allocation across DCN layers, and demonstrates that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model.
ImageNet classification with deep convolutional neural networks
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Weighted-Entropy-Based Quantization for Deep Neural Networks
This paper proposes a novel method for quantizing weights and activations based on the concept of weighted entropy, which achieves significant reductions in both the model size and the amount of computation with minimal accuracy loss.