Going deeper with convolutions

  Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott E. Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
  2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the… 
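The constant-budget claim in the abstract rests on 1x1 "bottleneck" convolutions that shrink channel depth before the expensive larger filters. A minimal parameter-count sketch of that idea (the channel counts below are illustrative, not the paper's exact configuration):

```python
def conv_params(in_ch, out_ch, k):
    """Number of weights in a k x k convolution, ignoring biases."""
    return in_ch * out_ch * k * k

# Direct 5x5 convolution on a 192-channel input producing 32 channels.
direct = conv_params(192, 32, 5)  # 192 * 32 * 25 = 153600

# Same output shape via a 1x1 reduction to 16 channels, then 5x5 to 32.
reduced = conv_params(192, 16, 1) + conv_params(16, 32, 5)
# 192*16 + 16*32*25 = 3072 + 12800 = 15872

print(direct, reduced)  # the bottleneck path needs roughly 10x fewer weights
```

This is why the Inception module can run 1x1, 3x3, and 5x5 branches in parallel and concatenate their outputs without the parameter count exploding.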

Deep Convolution Neural Networks in Computer Vision: a Review

This review paper focuses on techniques directly related to DCNNs, especially those needed to understand the architecture and techniques employed in the GoogLeNet network.

Fusing Deep Convolutional Networks for Large Scale Visual Concept Classification

  • H. Ergun, M. Sert
  • Computer Science
    2016 IEEE Second International Conference on Multimedia Big Data (BigMM)
  • 2016
This study investigates various aspects of convolutional neural networks (CNNs) from the big data perspective, and proposes efficient fusion mechanisms both for single and multiple network models.

Decomposing the Deep: Finding Class Specific Filters in Deep CNNs

It is demonstrated that the number of such features per class is much lower than the dimension of the final layer, and therefore the decision surface of deep CNNs lies on a low-dimensional manifold whose dimension is proportional to the network depth.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
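The "very small convolution filters" in the snippet above can be made concrete: two stacked 3x3 layers cover the same 5x5 receptive field as a single 5x5 layer, with fewer parameters and an extra nonlinearity between them. A quick sketch of the arithmetic (the channel count is illustrative, not VGG's exact configuration):

```python
def conv_params(channels, k):
    """Weights in a k x k conv with `channels` input and output channels."""
    return channels * channels * k * k

C = 64  # illustrative channel count

one_5x5 = conv_params(C, 5)      # 64 * 64 * 25 = 102400
two_3x3 = 2 * conv_params(C, 3)  # 2 * 64 * 64 * 9 = 73728

# Same receptive field, roughly 28% fewer weights.
print(one_5x5, two_3x3)
```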

A CONVblock for Convolutional Neural Networks

The main objective of this architecture is to improve the overall performance of the network through a new design based on the CONVblock; experiments demonstrate the effectiveness of the proposed method.

Convolutional Neural Network for Image Feature Extraction Based on Concurrent Nested Inception Modules

A new architecture based on the Inception module is proposed, which helps the network build a more comprehensive representation of the image and achieve strong performance.

Gradually Updated Neural Networks for Large-Scale Image Recognition

An alternative method increases the depth of neural networks by introducing computation orderings among the channels within convolutional layers or blocks; based on these orderings, the outputs are gradually computed in a channel-wise manner.

A fast, implementation of a deep vanilla

  • Computer Science
  • 2016
This work showed that the accuracy obtained by deep architectures such as GoogLeNet and the more recent ResNet on image classification and object detection tasks demonstrates that depth of representation is indeed the key to a successful implementation.

Deep convolutional neural networks as generic feature extractors

These findings indicate that convolutional networks are able to learn generic feature extractors that can be used for different tasks, and also indicate that the long time needed to train such deep networks is a major drawback.

Deep Pyramidal Residual Networks

This research gradually increases the feature map dimension at all units to involve as many locations as possible in the network architecture and proposes a novel residual unit capable of further improving the classification accuracy with the new network architecture.

Some Improvements on Deep Convolutional Neural Network Based Image Classification

This paper summarizes the entry in the ImageNet Large Scale Visual Recognition Challenge 2013, which achieved a top-5 classification error rate representing over a 20% relative improvement on the previous year's winner.

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

DeCAF, an open-source implementation of deep convolutional activation features, is released along with all associated network parameters, enabling vision researchers to experiment with deep representations across a range of visual concept learning paradigms.

Scalable Object Detection Using Deep Neural Networks

This work proposes a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
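The "dropout" regularizer mentioned in the snippet above randomly zeroes units during training so the network cannot rely on any single co-adapted feature. A minimal sketch of the now-common inverted variant (which scales at training time; the original paper instead scaled activations at test time):

```python
import random

def dropout(activations, p, rng):
    """Inverted dropout: zero each unit with probability p and scale the
    survivors by 1/(1-p) so the expected activation is unchanged."""
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else a * scale for a in activations]

rng = random.Random(0)  # seeded for reproducibility
out = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=rng)
print(out)  # surviving units are doubled; the rest are zeroed
```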

Visualizing and Understanding Convolutional Networks

A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.

Network In Network

With enhanced local modeling via the micro network, the proposed deep network structure NIN is able to utilize global average pooling over feature maps in the classification layer, which is easier to interpret and less prone to overfitting than traditional fully connected layers.
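Global average pooling, as described in the snippet above, collapses each final feature map to a single number, so with one map per class no fully connected classifier layer is needed. A minimal stdlib sketch over nested lists (shapes are illustrative):

```python
def global_average_pool(feature_maps):
    """Collapse each H x W feature map to its mean, yielding one scalar
    per channel; with one channel per class, these feed softmax directly."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]

# Two 2x2 feature maps -> two channel averages.
feats = [[[1.0, 3.0], [5.0, 7.0]],
         [[0.0, 2.0], [4.0, 6.0]]]
print(global_average_pool(feats))  # [4.0, 3.0]
```

Because the pooling layer has no weights, it cannot overfit, which is the interpretability and regularization argument the NIN abstract makes against fully connected classifier layers.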

Two-Stream Convolutional Networks for Action Recognition in Videos

This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
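The two streams described above are typically combined by late fusion of their per-class scores. A minimal sketch of a weighted average (the scores and weight are hypothetical, not values from the paper):

```python
def fuse_scores(spatial, temporal, w_spatial=0.5):
    """Late fusion: weighted average of per-class softmax scores from the
    spatial (RGB) stream and the temporal (optical-flow) stream."""
    return [w_spatial * s + (1.0 - w_spatial) * t
            for s, t in zip(spatial, temporal)]

spatial = [0.7, 0.2, 0.1]   # hypothetical per-class scores, RGB stream
temporal = [0.3, 0.6, 0.1]  # hypothetical per-class scores, flow stream
print(fuse_scores(spatial, temporal))  # approximately [0.5, 0.4, 0.1]
```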

Multi-scale Orderless Pooling of Deep Convolutional Activation Features

A simple but effective scheme called multi-scale orderless pooling (MOP-CNN), which extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the result.

Large-Scale Video Classification with Convolutional Neural Networks

This work studies multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information, and suggests a multiresolution, foveated architecture as a promising way of speeding up training.

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets), and establishes the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks.