On Compressing U-net Using Knowledge Distillation
@article{Mangalam2018OnCU, title={On Compressing U-net Using Knowledge Distillation}, author={Karttikeya Mangalam and Mathieu Salzmann}, journal={ArXiv}, year={2018}, volume={abs/1812.00249} }
We study the use of knowledge distillation to compress the U-net architecture. We show that, while standard distillation is not sufficient to reliably train a compressed U-net, introducing other regularization methods, such as batch normalization and class re-weighting, into the distillation process significantly improves training. This allows us to compress a U-net by over 1000x, i.e., to 0.1% of its original number of parameters, with a negligible decrease in performance.
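The combined objective described here pairs a temperature-softened distillation term with a class-re-weighted supervised term. Below is a minimal PyTorch sketch of such a loss, not the authors' released code; the temperature `T`, mixing weight `alpha`, and `class_weights` are illustrative assumptions, not the paper's reported settings.

```python
# Sketch of a distillation loss for a compressed U-net student; values are assumed.
import torch
import torch.nn.functional as F

T = 4.0       # distillation temperature (assumed, not the paper's reported setting)
alpha = 0.5   # mix between distillation and supervised terms (assumed)

def distillation_loss(student, teacher, images, labels, class_weights):
    with torch.no_grad():
        teacher_logits = teacher(images)      # (N, C, H, W) segmentation logits
    student_logits = student(images)

    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Hard-target term with class re-weighting to counter class imbalance.
    ce = F.cross_entropy(student_logits, labels, weight=class_weights)

    return alpha * kd + (1.0 - alpha) * ce
```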
9 Citations
Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks
- Computer Science · IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2022
This paper provides a comprehensive survey of recent progress on knowledge distillation (KD) methods and the student-teacher (S-T) frameworks typically used for vision tasks, and systematically analyzes the research status of KD in vision applications.
DNetUnet: a semi-supervised CNN of medical image segmentation for super-computing AI service
- Computer Science · The Journal of Supercomputing
- 2020
A new convolutional neural network architecture named DNetUnet is proposed, combining U-Nets with different down-sampling levels and a new dense block as the feature extractor. As a semi-supervised learning method, it can not only acquire expert knowledge from a labelled corpus but also improve generalization by learning from unlabelled data.
Towards End-to-End Deep Learning-based Writer Identification
- Computer Science · GI-Jahrestagung
- 2020
A fully end-to-end deep learning-based model is proposed, consisting of a U-Net for binarization, a ResNet-50 for feature extraction, and an optimized learnable residual encoding layer to obtain global descriptors.
Obesity May Be Bad: Compressed Convolutional Networks for Biomedical Image Segmentation
- Computer Science
- 2020
The Optimum Mimic Backbone (OMB) is introduced, which forces the compressed CNN to mimic how the original CNN behaves in optimal situations; it achieves higher IoU scores than other state-of-the-art compression techniques in experiments on four popular, diverse biomedical image segmentation datasets.
Low-Memory CNNs Enabling Real-Time Ultrasound Segmentation Towards Mobile Deployment
- Computer Science · IEEE Journal of Biomedical and Health Informatics
- 2020
This article demonstrates the power of ‘thin’ CNNs (with very few feature channels) for fast medical image segmentation, and proposes three approaches to training efficient CNNs that can operate in real-time on a CPU, with a low memory footprint, for minimal compromise in accuracy.
Deep Learning based Intraretinal Layer Segmentation using Cascaded Compressed U-Net
- Computer Science, Medicine · medRxiv
- 2021
This work proposes a cascaded two-stage network for intraretinal layer segmentation, with both networks being compressed versions of U-Net (CCU-INSEG), and introduces Laplacian-based outlier detection with layer surface hole filling by adaptive non-linear interpolation at the post-processing stage.
Segmentation of roots in soil with U-Net
- Computer Science · Plant Methods
- 2020
The feasibility of a U-Net-based CNN system for segmenting images of roots in soil and for replacing the manual line-intersect method is demonstrated, showing that deep learning is practical for small research groups that need to create their own custom labelled dataset from scratch.
Fast and Accurate Single-Image Depth Estimation on Mobile Devices, Mobile AI 2021 Challenge: Report
- Computer Science · 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
- 2021
This paper introduces the first Mobile AI challenge, whose target is to develop end-to-end deep learning-based depth estimation solutions that achieve nearly real-time performance on smartphones and IoT platforms.
Intraretinal Layer Segmentation Using Cascaded Compressed U-Nets
- Medicine · J. Imaging
- 2022
A cascaded two-stage network for intraretinal layer segmentation, with both networks being compressed versions of U-Net (CCU-INSEG), is proposed and validated; the results suggest that the method can robustly segment macular scans from eyes with even severe neuroretinal changes.
References
Compression-aware Training of Deep Networks
- Computer Science · NIPS
- 2017
It is shown that accounting for compression during training makes it possible to learn models that are much more compact than, yet at least as effective as, those produced by state-of-the-art compression techniques.
Learning Efficient Object Detection Models with Knowledge Distillation
- Computer Science · NIPS
- 2017
This work proposes a new framework to learn compact and fast object detection networks with improved accuracy using knowledge distillation and hint learning and shows consistent improvement in accuracy-speed trade-offs for modern multi-class detection models.
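The hint-learning component of such frameworks adds a feature-matching term between intermediate layers of the teacher and student. A minimal sketch under assumed channel counts; the 1x1 adapter and shapes are hypothetical, not taken from the paper:

```python
# Sketch of a "hint" loss: match an intermediate student feature map to the teacher's.
import torch
import torch.nn as nn
import torch.nn.functional as F

adapter = nn.Conv2d(64, 256, kernel_size=1)  # assumed student (64) and teacher (256) channels

def hint_loss(student_feat, teacher_feat):
    # L2 distance between adapted student features and frozen teacher features.
    return F.mse_loss(adapter(student_feat), teacher_feat.detach())
```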
Model compression
- Computer Science · KDD '06
- 2006
This work presents a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.
Distilling the Knowledge in a Neural Network
- Computer Science · ArXiv
- 2015
This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
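At the core of this scheme is the temperature-softened softmax: with logits z_i and temperature T > 1, the teacher's output distribution becomes softer and carries more information about inter-class similarities.

```latex
% Temperature-softened class probabilities (Hinton et al., 2015)
q_i = \frac{\exp(z_i / T)}{\sum_j \exp(z_j / T)}
```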
Learning both Weights and Connections for Efficient Neural Network
- Computer Science · NIPS
- 2015
A method is presented that reduces the storage and computation required by neural networks by an order of magnitude without affecting their accuracy: only the important connections are learned, and redundant connections are pruned using a three-step train-prune-retrain procedure.
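A minimal sketch of the magnitude-based pruning step in that procedure; the threshold value and masking scheme here are illustrative simplifications, not the paper's exact recipe:

```python
# Sketch of one pruning pass: zero out connections whose weights fall below a threshold.
import torch

def prune_by_magnitude(model, threshold=1e-2):   # threshold is an assumed value
    masks = {}
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.dim() > 1:                  # prune weight tensors, leave biases intact
                mask = (param.abs() >= threshold).float()
                param.mul_(mask)                 # remove weak connections in place
                masks[name] = mask               # reapply these masks while retraining
    return masks
```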
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
- Computer Science · ICML
- 2015
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
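For reference, the transform the paper defines: each activation is standardized over the mini-batch B, then rescaled and shifted with learned parameters gamma and beta to preserve expressiveness.

```latex
\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}},
\qquad
y_i = \gamma \hat{x}_i + \beta
```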
Stacked Hourglass Networks for Human Pose Estimation
- Computer Science · ECCV
- 2016
This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.
Model Distillation with Knowledge Transfer in Face Classification, Alignment and Verification
- Computer Science · ArXiv
- 2017
This paper takes face recognition as a breaking point and proposes model distillation with knowledge transfer from face classification to alignment and verification, and uses a common initialization trick to improve the distillation performance of classification.
Face Model Compression by Distilling Knowledge from Neurons
- Computer Science · AAAI
- 2016
This work addresses model compression for face recognition, where the learned knowledge of a large teacher network or its ensemble is utilized as supervision to train a compact student network by leveraging the essential characteristics of the learned face representation.
Very Deep Convolutional Networks for Large-Scale Image Recognition
- Computer Science · ICLR
- 2015
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.