DiCENet: Dimension-wise Convolutions for Efficient Networks

  • Sachin Mehta, Hannaneh Hajishirzi, Mohammad Rastegari
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
We introduce a novel and generic convolutional unit, the DiCE unit, that is built using dimension-wise convolutions and dimension-wise fusion. The dimension-wise convolutions apply lightweight convolutional filtering across each dimension of the input tensor, while dimension-wise fusion efficiently combines these dimension-wise representations, allowing the DiCE unit to efficiently encode the spatial and channel-wise information contained in the input tensor. The DiCE unit is simple and can be…
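As a rough illustration of the idea in the abstract, the sketch below applies a depth-wise style filter along each of the three axes of a D x H x W tensor and then combines the three results. It is one reading of "dimension-wise convolution", not the authors' implementation: the helper names (`permute`, `conv2d_same`, `dimension_wise_conv`, `fuse_mean`) are hypothetical, and the plain averaging used for fusion is a stand-in assumption, since the abstract does not say how fusion is computed.

```python
from itertools import product

def permute(t, order):
    """Copy a 3-D nested list with its axes reordered (like a tensor transpose)."""
    src = (len(t), len(t[0]), len(t[0][0]))
    dst = tuple(src[a] for a in order)
    out = [[[0.0] * dst[2] for _ in range(dst[1])] for _ in range(dst[0])]
    for idx in product(range(dst[0]), range(dst[1]), range(dst[2])):
        s = [0, 0, 0]
        for new_axis, old_axis in enumerate(order):
            s[old_axis] = idx[new_axis]
        out[idx[0]][idx[1]][idx[2]] = t[s[0]][s[1]][s[2]]
    return out

def conv2d_same(plane, kernel):
    """Zero-padded 'same' cross-correlation of one 2-D plane with an odd k x k kernel."""
    rows, cols, k = len(plane), len(plane[0]), len(kernel)
    r = k // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - r, j + dj - r
                    if 0 <= y < rows and 0 <= x < cols:
                        acc += plane[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out

def dimension_wise_conv(t, kernels):
    """Filter along each axis in turn; kernels[a] holds one kernel per slice of axis a."""
    branches = []
    for axis in range(3):
        order = [axis] + [a for a in range(3) if a != axis]
        inverse = [order.index(a) for a in range(3)]
        moved = permute(t, order)  # bring `axis` to the front
        filtered = [conv2d_same(p, k) for p, k in zip(moved, kernels[axis])]
        branches.append(permute(filtered, inverse))  # restore the D x H x W layout
    return branches

def fuse_mean(branches):
    """Stand-in fusion (an assumption): element-wise mean of the three branches."""
    a, b, c = branches
    return [[[(x + y + z) / 3.0 for x, y, z in zip(ra, rb, rc)]
             for ra, rb, rc in zip(pa, pb, pc)]
            for pa, pb, pc in zip(a, b, c)]

# Usage: identity kernels leave every branch, and hence the fused result, unchanged.
D, H, W = 2, 3, 4
x = [[[float(d * H * W + h * W + w) for w in range(W)] for h in range(H)] for d in range(D)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
kernels = [[identity] * D, [identity] * H, [identity] * W]
y = fuse_mean(dimension_wise_conv(x, kernels))
```

Each branch uses one small kernel per slice, so the number of weights grows with D + H + W rather than with the product of input and output channels, which is the sense in which the filtering is lightweight.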
AksharaNet: A GPU Accelerated Modified Depth-Wise Separable Convolution for Kannada Text Classification
Two key contributions are proposed: AksharaNet, a graphics processing unit (GPU) accelerated modified convolutional neural network architecture consisting of linearly inverted depth-wise separable convolutions, and a Kannada Scene Individual Character (KSIC) dataset, curated from the ground up and consisting of 46,800 images.
Efficient and robust deep learning architecture for segmentation of kidney and breast histopathology images
A deep learning model that automatically segments the complex nuclei present in histopathology images by implementing an effective encoder–decoder architecture with a separable convolution pyramid pooling network (SCPP-Net).
An approach to improve SSD through mask prediction of multi-scale feature maps
  • Peng Sun, Yaqin Zhao, Songhao Zhu
  • Computer Science
  • Pattern Anal. Appl.
  • 2021
A novel single-shot object detection network with a mask prediction branch that enhances object detection features with semantic information extracted from deeper layers; an improved receptive field block is adopted to enlarge the receptive field of the backbone network without much extra computing burden.
RRNet: Repetition-Reduction Network for Energy Efficient Depth Estimation
A Repetition-Reduction Network (RRNet) in which the number of depthwise channels is large enough to reduce computation time while simultaneously being small enough to reduce GPU latency, and which outperforms state-of-the-art lightweight models such as MobileNets, PyDNet, DiCENet, DABNet, and EfficientNet.
A Survey on Deep Domain Adaptation and Tiny Object Detection Challenges, Techniques and Datasets
This survey analyzes computer vision-based object detection challenges and the techniques proposed to solve them, and outlines future directions in light of the challenges that remain in the field.
GNN-RL Compression: Topology-Aware Network Pruning using Multi-stage Graph Embedding and Reinforcement Learning
A novel multi-stage graph embedding technique based on graph neural networks (GNNs) that identifies a DNN’s topology and uses reinforcement learning (RL) to find a suitable compression policy; the approach outperformed state-of-the-art methods and achieved higher accuracy.
MiNet: Efficient Deep Learning Automatic Target Recognition for Small Autonomous Vehicles
MiNet was successfully deployed onboard small OceanServer Iver3 autonomous underwater vehicles during the REBOOT sea trial and predicted the latitude, longitude, and class of objects detected in sonar images within minutes of the completion of each mission leg.
An Automatic Detection Algorithm of Metro Passenger Boarding and Alighting Based on Deep Learning and Optical Flow
A metro passenger detection (MPD) algorithm that tracks passengers getting on or off, composed of two MetroNexts and an optical flow algorithm; it robustly recognizes target passengers, and its detection speed remains competitive even on an embedded platform, making it an intelligent instrument for the metro station.
Exposing Semantic Segmentation Failures via Maximum Discrepancy Competition
This paper exposes failures of existing semantic segmentation methods in the open visual world under the constraint of very limited human labeling effort, and conducts a thorough MAD diagnosis of ten PASCAL VOC semantic segmentation algorithms.
Weather-Aware Long-Range Traffic Forecast Using Multi-Module Deep Neural Network
This study proposes a novel multi-module deep neural network framework which aims at improving intelligent long-term traffic forecasting. Following our previous system, the internal architecture of…


Squeeze-and-Excitation Networks
  • Jie Hu, Li Shen, Gang Sun
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and finds that SE blocks produce significant performance improvements for existing state-of-the-art deep architectures at minimal additional computational cost.
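The channel recalibration described above can be sketched in a few lines. This is a minimal pure-Python illustration, assuming the usual squeeze (global average pooling), excitation (two fully connected layers with ReLU and sigmoid), and scale steps; the function name `se_block`, the weight layout, and the omission of biases are assumptions made here for brevity.

```python
import math

def se_block(x, w1, w2):
    """x: C x H x W nested lists; w1: (C/r) x C weights; w2: C x (C/r) weights."""
    # Squeeze: one average value per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck layer with ReLU, then per-channel sigmoid gates.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]

# Usage: two channels, reduction ratio r = 2 (a single hidden unit).
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[10.0, 20.0], [30.0, 40.0]]]
w1 = [[1.0, 1.0]]
w2 = [[0.0], [0.0]]  # zero weights give a gate of sigmoid(0) = 0.5 for each channel
recalibrated = se_block(features, w1, w2)
```

Because the gates depend on statistics from all channels, each channel's scaling reflects the others, which is the interdependency the summary refers to.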
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
A fast and efficient convolutional neural network, ESPNet, for semantic segmentation of high resolution images under resource constraints, which outperforms all the current efficient CNN networks such as MobileNet, ShuffleNet, and ENet on both standard metrics and the newly introduced performance metrics that measure efficiency on edge devices.
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise…
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
This work addresses the task of semantic image segmentation with deep learning, proposes atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales, and improves the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models.
Rethinking Atrous Convolution for Semantic Image Segmentation
The proposed 'DeepLabv3' system significantly improves over the previous DeepLab versions without DenseCRF post-processing and attains comparable performance with other state-of-the-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.
Aggregated Residual Transformations for Deep Neural Networks
On the ImageNet-1K dataset, it is empirically shown that, even under the restricted condition of maintaining complexity, increasing cardinality improves classification accuracy and is more effective than going deeper or wider when increasing capacity.
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
Quantitative assessments show that SegNet provides good performance with competitive inference time and the most memory-efficient inference compared to other architectures, including FCN and DeconvNet.
ELASTIC: Improving CNNs with Instance Specific Scaling Policies
This paper introduces ELASTIC, a simple, efficient, and yet very effective approach to learning an instance-specific scale policy from data, and formulates the scaling policy as a non-linear function inside the network’s structure that is learned from data.
Xception: Deep Learning with Depthwise Separable Convolutions
  • François Chollet
  • Computer Science, Mathematics
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
This work proposes a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions, and shows that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset, and significantly outperforms it on a larger image classification dataset.
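To make the appeal of depthwise separable convolutions concrete, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart (a per-channel k x k filter followed by a 1 x 1 pointwise convolution). The function names and the layer sizes in the usage lines are illustrative assumptions, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out

def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise separable convolution: one k x k filter per input
    channel (depthwise) plus a 1 x 1 convolution that mixes channels (pointwise)."""
    return k * k * c_in + c_in * c_out

# Usage with a hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernels.
full = standard_conv_params(128, 256, 3)
separable = separable_conv_params(128, 256, 3)
ratio = full / separable
```

For these sizes the separable layer needs roughly one ninth of the weights, and the saving approaches a factor of k x k as the number of output channels grows.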
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and the effectiveness of this method is demonstrated by scaling up MobileNets and ResNet.
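Compound scaling can be sketched as follows, assuming the commonly cited form in which a single coefficient phi raises fixed per-dimension bases to give depth, width, and resolution multipliers. The base values 1.2, 1.1, and 1.15 are recalled from the EfficientNet paper rather than taken from this page, so treat them, along with the function names, as assumptions.

```python
# Assumed per-dimension bases (recalled from the EfficientNet paper, not from this page).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution

def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi

def flops_factor(phi):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2

# Usage: phi = 0 is the baseline network; each unit of phi roughly doubles FLOPs.
baseline = compound_scale(0)
one_step = flops_factor(1)
two_steps = flops_factor(2)
```

Scaling all three dimensions from one coefficient keeps them in a fixed balance, rather than tuning depth, width, and resolution independently.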