• Corpus ID: 14731791

Spatially-sparse convolutional neural networks

  title={Spatially-sparse convolutional neural networks},
  author={Benjamin Graham},
Convolutional neural networks (CNNs) perform well on problems such as handwriting recognition and image classification. [] Key Result Applying a deep convolutional network using sparsity has resulted in a substantial reduction in test error on the CIFAR small picture datasets: 6.28% on CIFAR-10 and 24.30% for CIFAR-100.

Figures and Tables from this paper

Quadtree Convolutional Neural Networks

The method decomposes and represents the image as a linear quadtree that is only refined in the non-empty portions of the image, enabling QCNN to learn from sparse images much faster and process high resolution images without the memory constraints faced by traditional CNNs.

Sparsity Invariant CNNs

This paper proposes a simple yet effective sparse convolution layer which explicitly considers the location of missing data during the convolution operation, and demonstrates the benefits of the proposed network architecture in synthetic and real experiments with respect to various baseline approaches.

Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in CNNs

This work introduces a suite of tools that exploit sparsity in both the feature maps and the filter weights of a CNN, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework, when processing data with a high degree of sparsity.

Coupled-learning convolutional neural networks for object recognition

Comprehensive evaluations on five benchmark datasets well demonstrate the significant superiority of the proposed Co-CNN framework over other existing algorithms.

Refining Architectures of Deep Convolutional Neural Networks

This paper introduces a novel strategy that alters the architecture of a given CNN for a specified dataset, to potentially enhance the original accuracy while possibly reducing the model size.

Hypercolumn Sparsification for Low-Power Convolutional Neural Networks

It is shown that hypercolumn sparsification could lead to more data-efficient learning as well as having an emergent property of significantly pruning down the number of connections in the network, thereby making pattern recognition amenable for energy-efficient hardware implementations.

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

This work creates an open-source auto-differentiation library for sparse tensors that provides extensive functions for high-dimensional convolutional neural networks and proposes the hybrid kernel, a special case of the generalized sparse convolution, and trilateral-stationary conditional random fields that enforce spatio-temporal consistency in the 7D space-time-chroma space.

PreNet: Parallel Recurrent Neural Networks for Image Classification

This paper proposes a hierarchical parallel recurrent neural network (PreNet) to model spatial context for image classification and can achieve the state-of-the-art classification performance, which demonstrates the advantage of PreNet over many comparative CNN structures.

Acorns: A Framework for Accelerating Deep Neural Networks with Input Sparsity

This paper proposes Acorns, a framework to accelerate deep neural networks with input sparsity that generates efficient sparse kernels for operators in neural networks from kernel templates, which combine directions that express specific optimizing transformations to be performed, and straightforward code that describes the computation.

Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference

An efficient CUDA implementation of the dynamic convolutions conditioned on the input image using a gather-scatter approach is provided, achieving a significant improvement in inference speed on MobileNetV2 and ShuffleNet V2.



Multi-column deep neural networks for image classification

On the very competitive MNIST handwriting benchmark, this method is the first to achieve near-human performance and improves the state-of-the-art on a plethora of common image classification benchmarks.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Network In Network

With enhanced local modeling via the micro network, the proposed deep network structure NIN is able to utilize global average pooling over feature maps in the classification layer, which is easier to interpret and less prone to overfitting than traditional fully connected layers.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition

3D Convolutional Neural Networks for Human Action Recognition

A novel 3D CNN model for action recognition that extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames.

Learning methods for generic object recognition with invariance to pose and lighting

  • Yann LeCunF. HuangL. Bottou
  • Computer Science
    Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.
  • 2004
A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second and proved impractical, while convolutional nets yielded 16/7% error.

Improving neural networks by preventing co-adaptation of feature detectors

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the

Word-level training of a handwritten word recognizer based on convolutional neural networks

  • Yann LeCunYoshua Bengio
  • Computer Science
    Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 3 - Conference C: Signal Processing (Cat. No.94CH3440-5)
  • 1994
A new approach for online recognition of handwritten words written in unconstrained mixed style where each pixel contains information about trajectory direction and curvature is introduced.

Rectifier Nonlinearities Improve Neural Network Acoustic Models

This work explores the use of deep rectifier networks as acoustic models for the 300 hour Switchboard conversational speech recognition task, and analyzes hidden layer representations to quantify differences in how ReL units encode inputs as compared to sigmoidal units.