SORT: Second-Order Response Transform for Visual Recognition

@article{Wang2017SORTSR,
  title={SORT: Second-Order Response Transform for Visual Recognition},
  author={Yan Wang and Lingxi Xie and Chenxi Liu and Ya Zhang and Wenjun Zhang and Alan Loddon Yuille},
  journal={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={1368--1377}
}
  • Yan Wang, Lingxi Xie, A. Yuille
  • Published 20 March 2017
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
In this paper, we reveal the importance and benefits of introducing second-order operations into deep neural networks. [...] Moreover, SORT augments the family of transform operations and increases the nonlinearity of the network, making it possible to learn flexible functions to fit the complicated distribution of feature space.
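As a rough illustration of what a second-order operation can look like, the sketch below adds an element-wise product term to the usual linear sum of two branch responses. The function names, the square-root squashing, and the `eps` constant are assumptions made for this example and are not taken from the excerpt above; the inputs are assumed to be non-negative (for instance, post-ReLU) so that the square root is defined.

```python
import math


def relu(values):
    """Clamp negative responses to zero so products stay non-negative."""
    return [max(0.0, v) for v in values]


def linear_combine(branch_a, branch_b):
    """First-order fusion: plain element-wise sum of two branch responses."""
    return [a + b for a, b in zip(branch_a, branch_b)]


def second_order_combine(branch_a, branch_b, eps=1e-8):
    """Hypothetical second-order fusion: a + b + sqrt(a * b + eps).

    The sqrt and eps are assumptions of this sketch, used to keep the
    product term on the same scale as the linear terms.
    """
    if len(branch_a) != len(branch_b):
        raise ValueError("branch responses must have the same length")
    a_pos, b_pos = relu(branch_a), relu(branch_b)
    return [a + b + math.sqrt(a * b + eps) for a, b in zip(a_pos, b_pos)]


if __name__ == "__main__":
    f1 = [1.0, 4.0, 0.0]
    f2 = [4.0, 1.0, 3.0]
    print(linear_combine(f1, f2))
    print(second_order_combine(f1, f2, eps=0.0))
```

The product term is non-zero only where both branches respond, which is one way to see how such a transform adds nonlinearity beyond a plain sum.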

Second-Order Response Transform Attention Network for Image Classification
TLDR
This work proposes a novel Second-order Response Transform Attention Network (SoRTA-Net) for classification tasks, which can be flexibly inserted into existing CNNs without any modification of network topology.
Detachable Second-Order Pooling: Toward High-Performance First-Order Networks.
TLDR
This work presents a novel architecture, namely a detachable second-order pooling network, to leverage the advantage of second-order pooling in first-order networks while keeping the model complexity unchanged during inference.
Global Second-Order Pooling Convolutional Networks
TLDR
A novel network model is proposed that introduces GSoP from lower to higher layers, making full use of the second-order statistics of the holistic image throughout the network.
Second-order Attention Guided Convolutional Activations for Visual Recognition
TLDR
This work makes an attempt to combine deep second-order statistics with attention mechanisms in ConvNets, and further proposes a novel Second-order Attention Guided Network (SoAG-Net) for visual recognition that outperforms its counterparts and achieves competitive performance with state-of-the-art models under the same backbone.
Second-order convolutional network for crowd counting
TLDR
This paper proposes a novel architecture, referred to as the Second-Order Convolutional Network (SOCN), to deal with single-image crowd counting from the perspective of improving the feature transformation capability of the network.
Gradually Updated Neural Networks for Large-Scale Image Recognition
TLDR
An alternative method to increase the depth of neural networks by introducing computation orderings to the channels within convolutional layers or blocks, based on which the outputs are gradually computed in a channel-wise manner.
Embedding Attention and Residual Network for Accurate Salient Object Detection
TLDR
An efficient fully convolutional salient object detection network is presented, in which attention weights are employed in a top-down manner to bridge high-level semantic information, helping shallow layers better locate salient objects and filter out noisy responses in the background region.
Propagation Mechanism for Deep and Wide Neural Networks
  • Dejiang Xu, M. Lee, W. Hsu
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2019
TLDR
This paper proposes a new propagation mechanism called channel-wise addition (cAdd) to deal with the vanishing gradients problem without sacrificing the complexity of the learned features, and is able to eliminate the need to store feature maps thus reducing the memory requirement.
Two-Level Attentions and Grouping Attention Convolutional Network for Fine-Grained Image Classification
TLDR
A clustering-based grouping attention model is proposed that implies part-level attention and implements the functions of group convolution and feature clustering, which can greatly reduce the network parameters and improve the recognition rate and interpretability of the network.
Few-Shot Image Recognition by Predicting Parameters from Activations
TLDR
A novel method is proposed that adapts a pre-trained neural network to novel categories by directly predicting the parameters from the activations; it achieves state-of-the-art classification accuracy on novel categories by a significant margin while keeping comparable performance on the large-scale categories.

References

SHOWING 1-10 OF 83 REFERENCES
Aggregated Residual Transformations for Deep Neural Networks
TLDR
On the ImageNet-1K dataset, it is empirically shown that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy and is more effective than going deeper or wider when the authors increase the capacity.
Deep Collaborative Learning for Visual Recognition
TLDR
This work formulates the function of a convolutional layer as learning a large visual vocabulary, and proposes an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity.
Towards Reversal-Invariant Image Representation
TLDR
This paper designs a reversal-invariant version of the SIFT descriptor named Max-SIFT, a generalized RIDE algorithm which can be applied to a large family of local descriptors, and reports consistent accuracy gains on various image classification tasks, including scene understanding, fine-grained object recognition, and large-scale visual recognition.
Going deeper with convolutions
We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition
Attention to Scale: Scale-Aware Semantic Image Segmentation
TLDR
An attention mechanism that learns to softly weight the multi-scale features at each pixel location is proposed, which not only outperforms average- and max-pooling, but also allows us to diagnostically visualize the importance of features at different positions and scales.
Very Deep Convolutional Networks for Large-Scale Image Recognition
TLDR
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
InterActive: Inter-Layer Activeness Propagation
TLDR
InterActive is presented, a novel algorithm which computes the activeness of neurons and network connections, and achieves state-of-the-art classification performance on a wide range of image datasets.
Factorized Bilinear Models for Image Recognition
TLDR
A novel Factorized Bilinear (FB) layer is proposed to model the pairwise feature interactions by considering the quadratic terms in the transformations of CNNs to reduce the risk of overfitting.
Bilinear CNN Models for Fine-Grained Visual Recognition
We propose bilinear models, a recognition architecture that consists of two feature extractors whose outputs are multiplied using outer product at each location of the image and pooled to obtain an
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
TLDR
DeCAF, an open-source implementation of deep convolutional activation features, is released along with all associated network parameters to enable vision researchers to conduct experimentation with deep representations across a range of visual concept learning paradigms.