SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size
- Forrest N. Iandola, M. Moskewicz, Khalid Ashraf, Song Han, W. Dally, K. Keutzer
- Computer Science, ArXiv
- 24 February 2016
This work proposes a small DNN architecture called SqueezeNet, which achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters and is able to compress to less than 0.5MB (510x smaller than AlexNet).
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search
- Bichen Wu, Xiaoliang Dai, K. Keutzer
- Computer Science, Computer Vision and Pattern Recognition
- 9 December 2018
This work proposes a differentiable neural architecture search (DNAS) framework that uses gradient-based methods to optimize ConvNet architectures, avoiding enumerating and training individual architectures separately as in previous methods.
The Landscape of Parallel Computing Research: A View from Berkeley
- K. Asanović, R. Bodík, K. Yelick
- Computer Science
- 18 December 2006
The parallel landscape is framed with seven questions, and the following are recommended to explore the design space rapidly: • The overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems. • The target should be 1000s of cores per chip, as these chips are built from processing elements that are the most efficient in MIPS (million instructions per second) per watt, MIPS per area of silicon, and MIPS per development dollar.
DenseNet: Implementing Efficient ConvNet Descriptor Pyramids
- Forrest N. Iandola, M. Moskewicz, Sergey Karayev, Ross B. Girshick, Trevor Darrell, K. Keutzer
- Computer Science, ArXiv
- 7 April 2014
This work presents DenseNet, an open source system that computes dense, multiscale features from the convolutional layers of a CNN-based object classifier.
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
- Yang You, Jing Li, Cho-Jui Hsieh
- Computer Science, International Conference on Learning…
- 1 April 2019
The empirical results demonstrate the superior performance of LAMB across various tasks such as BERT and ResNet-50 training with very little hyperparameter tuning, and the optimizer enables use of very large batch sizes of 32868 without any degradation of performance.
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
- Forrest N. Iandola, Song Han, M. Moskewicz, Khalid Ashraf, W. Dally, K. Keutzer
- Computer Science
- 2016
A small CNN architecture called SqueezeNet is proposed, which achieves AlexNet-level accuracy on ImageNet with 50x fewer parameters and is able to compress to less than 0.5MB (510× smaller than AlexNet).
SqueezeSeg: Convolutional Neural Nets with Recurrent CRF for Real-Time Road-Object Segmentation from 3D LiDAR Point Cloud
- Bichen Wu, Alvin Wan, Xiangyu Yue, K. Keutzer
- Computer Science, IEEE International Conference on Robotics and…
- 19 October 2017
This work proposes an end-to-end pipeline called SqueezeSeg, based on convolutional neural networks (CNNs), which takes a transformed LiDAR point cloud as input and directly outputs a point-wise label map, which is then refined by a conditional random field (CRF) implemented as a recurrent layer.
SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud
- Bichen Wu, Xuanyu Zhou, Sicheng Zhao, Xiangyu Yue, K. Keutzer
- Computer Science, IEEE International Conference on Robotics and…
- 22 September 2018
This work introduces a new model, SqueezeSegV2, which is more robust to dropout noise in LiDAR point clouds and therefore achieves a significant accuracy improvement, and a domain-adaptation training pipeline consisting of three major components: learned intensity rendering, geodesic correlation alignment, and progressive domain calibration.
ZeroQ: A Novel Zero Shot Quantization Framework
- Yaohui Cai, Z. Yao, Zhen Dong, A. Gholami, M. Mahoney, K. Keutzer
- Computer Science, Computer Vision and Pattern Recognition
- 2 January 2020
ZeroQ enables mixed-precision quantization without any access to the training or validation data, and it can finish the entire quantization process in less than 30s, a very low computational overhead.
SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving
- Bichen Wu, Forrest N. Iandola, Peter H. Jin, K. Keutzer
- Computer Science, IEEE Conference on Computer Vision and Pattern…
- 4 December 2016
SqueezeDet is a fully convolutional neural network for object detection that aims to be small, low-power, and real-time while remaining very accurate, achieving state-of-the-art accuracy on the KITTI benchmark.