ELASTIC: Improving CNNs with Instance Specific Scaling Policies
@article{Wang2018ELASTICIC,
  title={ELASTIC: Improving CNNs with Instance Specific Scaling Policies},
  author={Huiyu Wang and Aniruddha Kembhavi and Ali Farhadi and Alan Loddon Yuille and Mohammad Rastegari},
  journal={ArXiv},
  year={2018},
  volume={abs/1812.05262}
}
Scale variation has been a challenge from traditional to modern approaches in computer vision. Most solutions to scale issues have a similar theme: a set of intuitive and manually designed policies that are generic and fixed (e.g. SIFT or feature pyramids). We argue that the scale policy should be learned from data. In this paper, we introduce ELASTIC, a simple, efficient and yet very effective approach to learning an instance-specific scale policy from data. We formulate the scaling policy as a non…
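The abstract's core idea, a scale policy chosen per input rather than fixed by hand, can be illustrated with a toy sketch. This is not the paper's actual architecture; every function name here is illustrative, and the hand-written gate stands in for the weighting that ELASTIC would learn from data.

```python
# Toy sketch of an instance-specific scaling policy (illustrative only).
# Each input gets its own soft weighting between a full-resolution branch
# and a downsampled branch, instead of one fixed downsampling schedule.

def downsample(x):
    """Average-pool a 1D signal by a factor of 2."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def upsample(x):
    """Nearest-neighbour upsample by a factor of 2."""
    return [v for v in x for _ in range(2)]

def scale_gate(x):
    """Toy 'policy': weight the low-res branch more for smooth inputs.
    In the paper this weighting would be learned, not hand-written."""
    variation = sum(abs(a - b) for a, b in zip(x, x[1:])) / max(len(x) - 1, 1)
    w_low = 1.0 / (1.0 + variation)   # smooth input -> rely on low-res path
    return w_low, 1.0 - w_low

def elastic_block(x):
    w_low, w_high = scale_gate(x)
    low = upsample(downsample(x))      # process at half resolution, then restore
    return [w_low * l + w_high * h for l, h in zip(low, x)]

smooth = [1.0, 1.0, 1.0, 1.0]
sharp = [0.0, 4.0, 0.0, 4.0]
print(elastic_block(smooth))                          # [1.0, 1.0, 1.0, 1.0]
print(scale_gate(sharp)[0] < scale_gate(smooth)[0])   # True: sharp input favours high-res
```

The point of the sketch is only that the branch weighting is a function of the instance, which is what distinguishes a learned policy from a fixed one like a hand-designed pyramid.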
5 Citations
Multi-Dimensional Pruning: A Unified Framework for Model Compression
- Computer Science2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This work proposes a unified model compression framework called Multi-Dimensional Pruning (MDP) to simultaneously compress the convolutional neural networks (CNNs) on multiple dimensions and demonstrates that the MDP framework outperforms the existing methods when pruning both 2D and 3D CNNs.
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks With Octave Convolution
- Computer Science2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes to factorize the mixed feature maps by their frequencies, and design a novel Octave Convolution (OctConv) operation to store and process feature maps that vary spatially “slower” at a lower spatial resolution reducing both memory and computation cost.
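The frequency factorization described in the OctConv summary can be sketched in miniature: split a feature map into a low-frequency part stored at half resolution plus a full-resolution high-frequency residual. This is a hedged 1D illustration of the storage idea, not the paper's convolution operator.

```python
# Toy illustration of the Octave Convolution storage idea: a low-frequency
# component kept at half resolution plus a full-resolution residual.

def avg_pool2(x):
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]

def upsample2(x):
    return [v for v in x for _ in range(2)]

def octave_split(x):
    low = avg_pool2(x)                                  # half-resolution, low-frequency part
    high = [a - b for a, b in zip(x, upsample2(low))]   # residual detail at full resolution
    return high, low

def octave_merge(high, low):
    return [h + l for h, l in zip(high, upsample2(low))]

x = [1.0, 3.0, 2.0, 6.0]
high, low = octave_split(x)
print(low)                            # [2.0, 4.0]  (half as many values to store)
print(octave_merge(high, low) == x)   # True: the split is lossless
```

Storing the low-frequency branch at half resolution is where the memory and compute savings come from; convolutions on that branch touch a quarter as many 2D positions.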
DiCENet: Dimension-Wise Convolutions for Efficient Networks
- Computer ScienceIEEE Transactions on Pattern Analysis and Machine Intelligence
- 2022
A novel and generic convolutional unit built from dimension-wise convolutions and dimension-wise fusion that shows significant improvements over state-of-the-art models across various computer vision tasks including image classification, object detection, and semantic segmentation.
Exploring Multi-Scale Feature Propagation and Communication for Image Super Resolution
- Computer ScienceArXiv
- 2020
This work presents a unified formulation over widely-used multi-scale structures -- Multi-Scale cross-Scale Share-weights convolution (MS$^3$-Conv) -- which can achieve better SR performance than the standard convolution with fewer parameters and lower computational cost.
EFuse: Efficient Channel Fusion Operations (citation entry garbled; only figure residue survives, a diagram of dimension-wise convolution (DimConv) and weighted channel fusion with notations H: height, W: width, D: depth, n: kernel size)
- 2019
40 References
Feature Pyramid Networks for Object Detection
- Computer Science2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.
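The FPN summary describes a top-down pathway that upsamples coarse, semantically strong features and merges them into finer levels via lateral connections. A minimal sketch of that merge, with 1D lists standing in for feature maps and simple addition standing in for the lateral 1x1 convolutions:

```python
# Toy sketch of the feature-pyramid merge: coarse maps are upsampled and
# added to finer lateral maps, so every pyramid level carries high-level
# semantics. Lists stand in for feature maps; addition stands in for the
# lateral 1x1 convolutions.

def upsample2(x):
    return [v for v in x for _ in range(2)]

def build_pyramid(bottom_up):
    """bottom_up: feature maps ordered fine -> coarse, each half the
    length of the previous. Returns the merged map at every level."""
    merged = [bottom_up[-1]]                  # start from the coarsest map
    for lateral in reversed(bottom_up[:-1]):
        top_down = upsample2(merged[0])
        merged.insert(0, [l + t for l, t in zip(lateral, top_down)])
    return merged

c = [[1.0, 2.0, 3.0, 4.0], [1.0, 1.0], [5.0]]   # fine -> coarse
print(build_pyramid(c))   # [[7.0, 8.0, 9.0, 10.0], [6.0, 6.0], [5.0]]
```

Note how the coarsest value (5.0) propagates into every finer level, which is exactly the "marginal extra cost" the summary refers to: the merge is just upsampling and addition.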
Aggregated Residual Transformations for Deep Neural Networks
- Computer Science2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
On the ImageNet-1K dataset, it is empirically shown that, even under the restricted condition of maintaining complexity, increasing cardinality improves classification accuracy and is more effective than going deeper or wider when capacity is increased.
ParseNet: Looking Wider to See Better
- Computer ScienceArXiv
- 2015
This work presents a technique for adding global context to deep convolutional networks for semantic segmentation, and achieves state-of-the-art performance on SiftFlow and PASCAL-Context with small additional computational cost over baselines.
Multilabel Image Classification With Regional Latent Semantic Dependencies
- Computer ScienceIEEE Transactions on Multimedia
- 2018
The proposed RLSD achieves the best performance among state-of-the-art models, especially for predicting small objects occurring in the images, and can approach the upper bound without using bounding-box annotations, which is more realistic in practice.
Fully convolutional networks for semantic segmentation
- Computer Science2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2015
The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
- Computer ScienceECCV
- 2018
A fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high-resolution images under resource constraints, which outperforms all current efficient CNNs such as MobileNet, ShuffleNet, and ENet on both standard metrics and newly introduced metrics that measure efficiency on edge devices.
Rethinking Atrous Convolution for Semantic Image Segmentation
- Computer ScienceArXiv
- 2017
The proposed DeepLabv3 system significantly improves over the previous DeepLab versions without DenseCRF post-processing and attains comparable performance with other state-of-the-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.
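Atrous (dilated) convolution, the operation DeepLabv3 revisits, spaces the kernel taps `rate` apart, enlarging the field of view without adding parameters. A minimal 1D sketch (valid positions only; this is a generic illustration of the operation, not DeepLab's implementation):

```python
# Toy 1D atrous (dilated) convolution: kernel taps are `rate` apart, so a
# 3-tap kernel covers a span of 2*rate + 1 input positions with the same
# three parameters.

def atrous_conv1d(x, kernel, rate=1):
    span = (len(kernel) - 1) * rate   # receptive field minus one
    return [sum(k * x[i + j * rate] for j, k in enumerate(kernel))
            for i in range(len(x) - span)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(atrous_conv1d(x, [1.0, 1.0, 1.0], rate=1))   # [6.0, 9.0, 12.0, 15.0]
print(atrous_conv1d(x, [1.0, 1.0, 1.0], rate=2))   # [9.0, 12.0]
```

With rate=1 this is an ordinary convolution; with rate=2 the same three weights see a window of five input positions, which is how atrous convolution trades output density for receptive field at constant parameter count.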
Improving Pairwise Ranking for Multi-label Image Classification
- Computer Science2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2017
A novel loss function for pairwise ranking is proposed, which is smooth everywhere, and a label decision module is incorporated into the model, estimating the optimal confidence thresholds for each visual concept.
Rethinking the Inception Architecture for Computer Vision
- Computer Science2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible through suitably factorized convolutions and aggressive regularization.
Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation
- Computer Science2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
- 2016
This work introduces another model - dubbed Recombinator Networks - where coarse features inform finer features early in their formation such that finer features can make use of several layers of computation in deciding how to use coarse features.