Corpus ID: 239009677

Non-deep Networks

Authors: Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun
Depth is the hallmark of deep neural networks. But more depth means more sequential computation and higher latency. This begs the question – is it possible to build high-performing “non-deep” neural networks? We show that it is. To do so, we use parallel subnetworks instead of stacking one layer after another. This helps effectively reduce depth while maintaining high performance. By utilizing parallel substructures, we show, for the first time, that a network with a depth of just 12 can… 
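The core idea in the abstract, replacing a deep stack with parallel subnetworks to cut the longest sequential path, can be illustrated with a toy numpy sketch. This is not the actual ParNet architecture; the layer, branch count, and additive fusion are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def block(x, w):
    # One "layer": linear transform + ReLU nonlinearity.
    return np.maximum(w @ x, 0.0)

x = rng.standard_normal(16)
ws = [rng.standard_normal((16, 16)) * 0.1 for _ in range(12)]

# Sequential design: 12 layers stacked, so the longest path (depth) is 12.
h = x
for w in ws:
    h = block(h, w)

# Parallel design: 3 subnetworks of 4 layers each run side by side and
# are then fused, so the longest path is 4 layers plus one fusion step.
branches = []
for b in range(3):
    hb = x
    for w in ws[b * 4:(b + 1) * 4]:
        hb = block(hb, w)
    branches.append(hb)
fused = np.sum(branches, axis=0)  # simple additive fusion (an assumption)

print(h.shape, fused.shape)  # both use the same 12 weight matrices
```

Both designs spend the same total computation, but the parallel branches can in principle run concurrently, which is the latency argument the abstract makes.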

Figures and Tables from this paper

A Lightweight Image Entropy-Based Divide-and-Conquer Network for Low-Light Image Enhancement

IEDCN, a lightweight image entropy-based divide-and-conquer network for low-light image enhancement with only 0.73M parameters, effectively improves the quality of enhanced images while saving up to 53% of FLOPs on the LOL dataset.

Temporal Self Attention-Based Residual Network for Environmental Sound Classification

A linear self-attention (LSA) mechanism with a learnable memory unit encodes the temporal and spectral characteristics of the spectrogram during training of a deep ESC model that is comparable or superior to state-of-the-art attention-based deep ESC models.

Multi-Scale Safety Helmet Detection Based on RSSE-YOLOv3

The improved algorithm raises the precision of the YOLOv3 algorithm by 3.9%, the recall by 5.2%, and the average precision by 4.7%, significantly improving detection performance.

Investigation of Performance of Visual Attention Mechanisms for Environmental Sound Classification: A Comparative Study

The performance of a deep network is investigated when twelve SOTA visual attention mechanisms are incorporated into its training on two benchmark environmental sound classification datasets.

OSO-YOLOv5: Automatic Extraction Method of Store Signboards in Street View Images Based on Multi-Dimensional Analysis

The proposed OSO-YOLOv5 network integrates location attention and topology reconstruction, realizes automatic extraction of information from store signboards, improves computational efficiency, and effectively suppresses the effect of occlusion.

Combining CNN and MLP for Plant Pathology Recognition in Natural Scenes

This paper proposes a parallel architecture model that combines CNN's strong inductive bias with MLP's ability to capture global features, reducing attention paid to the background scene.


References

Wide Residual Networks

This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width is increased; the resulting structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.

Deep Networks with Stochastic Depth

Stochastic depth is proposed, a training procedure that enables the seemingly contradictory setup of training short networks and using deep networks at test time; it reduces training time substantially and significantly improves test error on almost all datasets used for evaluation.
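The mechanism behind that snippet can be sketched in a few lines: during training each residual block is skipped entirely with some probability, and at test time its output is scaled by its survival probability. This is a minimal illustration, not the paper's implementation; the survival probability and stand-in transform are assumptions.

```python
import random

def residual_block(x, transform, survival_p, training):
    # Stochastic depth: during training, drop the block's transform
    # with probability (1 - survival_p), leaving only the identity
    # shortcut, so the effective network is shorter. At test time the
    # transform is kept but scaled by its expected contribution.
    if training:
        if random.random() < survival_p:
            return x + transform(x)
        return x  # block skipped: shortcut only
    return x + survival_p * transform(x)

random.seed(0)
double = lambda v: 2.0 * v  # stand-in transform for illustration

train_out = residual_block(1.0, double, survival_p=0.5, training=True)
test_out = residual_block(1.0, double, survival_p=0.5, training=False)
print(test_out)  # 1.0 + 0.5 * 2.0 = 2.0
```

A training pass either returns 1.0 (block skipped) or 3.0 (block kept), while the test pass deterministically averages the two regimes.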

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

A new scaling method is proposed that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient, and its effectiveness is demonstrated by scaling up MobileNets and ResNet.
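The compound coefficient works by tying depth, width, and resolution multipliers to a single exponent. A small sketch, using the base coefficients reported for EfficientNet (alpha = 1.2, beta = 1.1, gamma = 1.15, chosen so that one unit of the exponent roughly doubles FLOPs):

```python
# Compound scaling: depth scales as alpha**phi, width as beta**phi,
# and input resolution as gamma**phi, with the constraint
# alpha * beta**2 * gamma**2 ~= 2 so that each increment of phi
# approximately doubles total FLOPs.
alpha, beta, gamma = 1.2, 1.1, 1.15

def compound_scale(phi):
    depth_mult = alpha ** phi
    width_mult = beta ** phi
    res_mult = gamma ** phi
    return depth_mult, width_mult, res_mult

d, w, r = compound_scale(1)
flops_growth = alpha * beta**2 * gamma**2  # FLOPs factor per unit phi
print(round(flops_growth, 2))  # close to 2
```

FLOPs grow quadratically with width and resolution but linearly with depth, which is why beta and gamma appear squared in the constraint.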

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
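Why residual learning eases optimization can be shown with a toy numpy sketch (illustrative, not the paper's convolutional formulation): with an identity shortcut, a block whose weights are zero still passes the signal through unchanged, so extra depth defaults to a no-op rather than destroying information.

```python
import numpy as np

rng = np.random.default_rng(0)

def plain_layer(x, w):
    # An ordinary layer: linear transform + ReLU.
    return np.maximum(w @ x, 0.0)

def residual_layer(x, w):
    # Residual learning: the layer fits F(x) = plain_layer(x, w) and
    # the identity shortcut adds x back, so F only has to model the
    # deviation from the identity mapping.
    return x + plain_layer(x, w)

x = rng.standard_normal(16)
w_zero = np.zeros((16, 16))

# With zero weights, the plain layer wipes out the signal while the
# residual layer passes it through unchanged.
print(np.allclose(residual_layer(x, w_zero), x))  # True
print(np.allclose(plain_layer(x, w_zero), x))     # False
```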

Densely Connected Convolutional Networks

The Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion, and has several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters.
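The dense connectivity pattern can be sketched with 1-D feature vectors (a simplification of the paper's 2-D feature maps; the layer sizes and growth rate are illustrative): each layer consumes the concatenation of everything before it and contributes a fixed number of new features.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_block(x, num_layers, growth_rate):
    # Each layer sees the concatenation of the block input and every
    # earlier layer's output, and adds `growth_rate` new features.
    # Reusing earlier features keeps each layer narrow.
    features = [x]
    for _ in range(num_layers):
        inp = np.concatenate(features)
        w = rng.standard_normal((growth_rate, inp.size)) * 0.1
        features.append(np.maximum(w @ inp, 0.0))
    return np.concatenate(features)

out = dense_block(rng.standard_normal(8), num_layers=4, growth_rate=4)
print(out.size)  # 8 input features + 4 layers * growth rate 4 = 24
```

The output width grows linearly in the number of layers (input size plus num_layers times growth_rate), which is how DenseNet keeps parameter counts small.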

Scaled-YOLOv4: Scaling Cross Stage Partial Network

We show that the YOLOv4 object detection neural network based on the CSP approach, scales both up and down and is applicable to small and large networks while maintaining optimal speed and accuracy.

Squeeze-and-Excitation Networks

This work proposes a novel architectural unit, termed the "Squeeze-and-Excitation" (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked to form SENet architectures that generalise extremely effectively across different datasets.

Rethinking the Inception Architecture for Computer Vision

This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, via suitably factorized convolutions and aggressive regularization.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Feature Pyramid Networks for Object Detection

This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.