• Corpus ID: 54032226

Progressive Recurrent Learning for Visual Recognition

Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Alan Loddon Yuille
Computer vision is difficult, partly because the mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn. A currently popular solution is to design a deep neural network and optimize it on a large-scale dataset. However, as the number of parameters increases, the generalization ability is often not guaranteed, e.g., the model can over-fit due to the limited amount of training data, or fail to converge because the desired function is too difficult to… 


Curriculum Learning by Transfer Learning: Theory and Experiments with Deep Networks
It is proved that the rate of convergence of an ideal curriculum learning method is monotonically increasing with the difficulty of the examples, and that this increase in convergence rate is monotonically decreasing as training proceeds.
SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data
This paper presents an approach for learning effectively from synthesized data; it achieves higher classification accuracy, especially when the number of training samples is limited, demonstrating its efficiency in exploring the infinite data space.
Curriculum learning
It is hypothesized that curriculum learning has both an effect on the speed of convergence of the training process to a minimum and on the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).
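As a concrete illustration of the easy-to-hard idea, the sketch below orders samples by a given difficulty score and lets the sampling pool grow in stages; the per-sample scores, the linear staging scheme, and the function name are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def curriculum_batches(samples, difficulty, n_stages=3, batch_size=4, seed=0):
    """Yield (stage, batch) pairs drawn from an easy-to-hard expanding pool.

    `difficulty` is an assumed per-sample score (lower = easier); at stage k
    the pool contains roughly the easiest k/n_stages fraction of the data.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(difficulty)  # indices sorted easiest-first
    n = len(samples)
    for stage in range(1, n_stages + 1):
        # Grow the candidate pool linearly with the stage number.
        pool = order[: max(batch_size, n * stage // n_stages)]
        batch = rng.choice(pool, size=batch_size, replace=False)
        yield stage, [samples[i] for i in batch]
```

Early batches then come only from the easiest examples, while the final stage samples from the full dataset, which is the "continuation method" flavor the summary above refers to.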
Curriculum Learning for Multi-task Classification of Visual Attributes
This paper introduces a novel method to combine the advantages of both multi-task and curriculum learning in a visual attribute classification framework and demonstrates the effectiveness of this approach on the publicly available SoBiR, VIPeR and PETA datasets.
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
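The residual trick itself is simple: each block adds its input back to its transformed output, so a block with zeroed weights reduces to the identity. A minimal numpy sketch of the forward pass (illustrative dense weights; the paper's blocks use convolutions and batch normalization):

```python
import numpy as np

def residual_block(x, w1, w2):
    """Forward pass of a plain residual block: y = x + W2 @ relu(W1 @ x).

    The identity shortcut means the block only needs to learn the
    residual F(x) = y - x, which eases optimization of very deep stacks.
    """
    h = np.maximum(w1 @ x, 0.0)  # first transform + ReLU
    return x + w2 @ h            # add the identity shortcut
```

With all-zero weights the block is exactly the identity mapping, which is why stacking many such blocks does not degrade a shallower solution.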
Auto-context and its application to high-level vision tasks
  • Z. Tu
  • Computer Science
    2008 IEEE Conference on Computer Vision and Pattern Recognition
  • 2008
An auto-context algorithm that learns an integrated low-level and context model; it is very general, easy to implement, and has the potential to be used for a wide variety of multi-variate labeling problems.
Look and Think Twice: Capturing Top-Down Visual Attention with Feedback Convolutional Neural Networks
The background of feedback in the human visual cortex is introduced, which motivates a computational feedback mechanism in deep neural networks: a feedback loop infers the activation status of hidden-layer neurons according to the "goal" of the network.
ImageNet classification with deep convolutional neural networks
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
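The dropout regularizer mentioned above can be sketched in a few lines. This is the now-standard "inverted" formulation, which scales surviving activations at training time; it is equivalent in expectation to the paper's original test-time scaling, not a verbatim reproduction of it:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p and scale the
    survivors by 1/(1-p), so the expected activation is unchanged.
    At test time the layer is the identity."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= p  # keep each unit with probability 1-p
    return x * mask / (1.0 - p)
```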
Understanding the difficulty of training deep feedforward neural networks
The objective here is to understand better why standard gradient descent from random initialization is doing so poorly with deep neural networks, to better understand these recent relative successes and help design better algorithms in the future.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
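In training mode, batch normalization standardizes each feature over the mini-batch and then applies a learned affine transform. A minimal numpy sketch of that forward pass (training statistics only, omitting the running averages used at inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature of x (shape: batch x features) to zero mean
    and unit variance over the batch, then apply gamma * x_hat + beta."""
    mu = x.mean(axis=0)                  # per-feature batch mean
    var = x.var(axis=0)                  # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```

Keeping the normalized activations' distribution stable across layers is what lets training use much larger learning rates, which is where the 14x speedup claimed above comes from.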