Analysis of Dropout Learning Regarded as Ensemble Learning

@inproceedings{Hara2016AnalysisOD,
  title={Analysis of Dropout Learning Regarded as Ensemble Learning},
  author={Kazuyuki Hara and Daisuke Saitoh and Hayaru Shouno},
  booktitle={International Conference on Artificial Neural Networks},
  year={2016}
}
Deep learning is the state of the art in fields such as visual object recognition and speech recognition. Such networks use many layers, a huge number of units, and many connections, so overfitting is a serious problem. Dropout learning was proposed to avoid this problem: it neglects some inputs and hidden units during learning, each with probability p, and the neglected inputs and hidden units are then combined with the learned network to express the final…
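
A minimal numpy sketch of the mechanism the abstract describes: units are dropped with probability p during training, and at test time the full network stands in for the averaged ensemble of thinned networks, with activations scaled by (1 - p). Function and variable names are illustrative, not from the paper.

import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, p, train=True):
    # Training: zero each unit independently with drop probability p.
    if train:
        mask = rng.random(h.shape) >= p      # keep with probability 1 - p
        return h * mask
    # Test: keep every unit but scale by (1 - p) so the expected
    # activation matches the training-time average over thinned networks.
    return h * (1.0 - p)

h = np.array([0.5, -1.2, 0.8, 2.0])            # a hidden-layer activation
print(dropout_forward(h, p=0.5, train=True))   # some units zeroed
print(dropout_forward(h, p=0.5, train=False))  # all units, halved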

Regularized Deep Convolutional Neural Networks for Feature Extraction and Classification

It is shown that with the right combination of applied regularization techniques, such as fully-connected dropout, max-pooling dropout, L2 regularization, and He initialization, it is possible to achieve good results in object recognition with small networks and without data augmentation.
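
As a concrete illustration, a small PyTorch sketch combining the four techniques the summary names; layer sizes, rates, the assumed 3x32x32 input, and the placement of max-pooling dropout (here, plain dropout directly after the pooling layer) are assumptions, not taken from the paper.

import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout(0.25),       # "max-pooling dropout"
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Dropout(0.5),                         # fully-connected dropout
            nn.Linear(128, n_classes),
        )
        for m in self.modules():                     # He initialization
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight, nonlinearity='relu')

    def forward(self, x):                            # x: (B, 3, 32, 32)
        return self.classifier(self.features(x))

model = SmallNet()
# L2 regularization enters through the optimizer's weight decay.
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=5e-4)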

An ETF view of Dropout regularization

A new interpretation of Dropout from a frame-theory perspective is provided, showing that for a certain family of autoencoders with a linear encoder, optimizing the encoder under dropout regularization leads to an equiangular tight frame (ETF).
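
For reference, the standard frame-theory definition the summary alludes to (this is the textbook condition, not notation from the paper): K unit vectors f_1, …, f_K in R^d (K > d) form an equiangular tight frame when

\|f_k\|_2 = 1, \qquad
|\langle f_j, f_k \rangle| = \sqrt{\frac{K - d}{d\,(K - 1)}} \quad (j \neq k), \qquad
\sum_{k=1}^{K} f_k f_k^{\top} = \frac{K}{d}\, I_d ,

the common angle being the Welch bound, the minimum possible coherence for K unit vectors in d dimensions.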

Input Image Pixel Interval method for Classification Using Transfer Learning

This study introduces input image preprocessing, an enhanced neural network optimization method, and a prediction probability ensemble to minimize the number of trainable parameters while maintaining accuracy.
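
Read literally, a "prediction probability ensemble" averages the class-probability vectors of several models; a minimal numpy sketch under that assumption (the paper's exact combination rule may differ):

import numpy as np

def probability_ensemble(prob_lists):
    # Average the (n_models, n_classes) probability vectors, then argmax.
    avg = np.mean(prob_lists, axis=0)
    return int(np.argmax(avg)), avg

# three models' softmax outputs for one image
p = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.6, 0.1, 0.3]]
label, avg = probability_ensemble(p)
print(label, avg)   # class 0, averaged probabilities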

“In-Network Ensemble”: Deep Ensemble Learning with Diversified Knowledge Distillation

The results show that INE ("In-Network Ensemble") outperforms state-of-the-art deep ensemble learning algorithms in accuracy.

An Introduction to Deep Learning

This chapter aims to briefly introduce the fundamentals of deep learning, the key component of deep reinforcement learning, and to gradually progress to more complex but powerful architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

Dropout Prediction Variation Estimation Using Neuron Activation Strength

This approach provides an inference-once alternative for estimating dropout prediction variation as an auxiliary task, and demonstrates that activation features from a subset of the network's layers can suffice to achieve variation-estimation performance almost comparable to using activation features from all layers, reducing the resources needed even further.
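
For context, dropout prediction variation is conventionally measured with multiple stochastic forward passes, as in the PyTorch sketch below; the paper's contribution is a one-pass estimate of this quantity from activation strengths, which is not reproduced here.

import torch

def mc_dropout_variation(model, x, n_samples=30):
    # Keep dropout layers stochastic at inference (note: train() also
    # affects batch norm, so a real model may need per-layer handling).
    model.train()
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    # Per-output standard deviation across stochastic passes: this is
    # the expensive baseline a one-pass estimator would replace.
    return preds.std(dim=0)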

Exploring Dropout Discriminator for Domain Adaptation

On ensemble techniques of weight-constrained neural networks

The proposed models are based on bagging and boosting, two of the most popular ensemble strategies, to efficiently combine the predictions of weight-constrained neural network (WCNN) classifiers, providing empirical evidence that hybridizing ensemble learning and WCNNs can build efficient and powerful classification models.
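
A minimal scikit-learn sketch of the bagging variant, with an L2 penalty (alpha) standing in for the paper's weight constraint; all hyperparameters are illustrative.

from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier

# Each base MLP is trained on a bootstrap sample of the data and the
# ensemble combines their predictions by voting.
base = MLPClassifier(hidden_layer_sizes=(32,), alpha=1e-3, max_iter=500)
ensemble = BaggingClassifier(estimator=base, n_estimators=10)
# ('estimator' is named 'base_estimator' in older scikit-learn releases.)
# ensemble.fit(X_train, y_train); ensemble.predict(X_test)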

Data-dependence of plateau phenomenon in learning with neural network—statistical mechanical analysis

It is shown that data whose covariance has small and dispersed eigenvalues tend to make the plateau phenomenon inconspicuous, helping to explain the gap between theory, where plateaus are prominent, and practice, where they are often not observed.

Proactive Minimization of Convolutional Networks

This method shrinks the neural network by omitting convolutional kernels during the training process, while keeping the quality of the results high.
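
One plausible reading of kernel omission, sketched in PyTorch: zero out the convolutional kernels with the smallest L1 norm during training so they can be removed afterwards. The selection criterion is an assumption, not necessarily the paper's.

import torch
import torch.nn as nn

def prune_weak_kernels(conv: nn.Conv2d, fraction=0.1):
    with torch.no_grad():
        # One L1 norm per output kernel: weight is (out_ch, in_ch, kH, kW).
        norms = conv.weight.abs().sum(dim=(1, 2, 3))
        k = max(1, int(fraction * norms.numel()))
        weak = norms.argsort()[:k]           # indices of the weakest kernels
        conv.weight[weak] = 0.0              # omit them from the network
        if conv.bias is not None:
            conv.bias[weak] = 0.0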

References


Improving neural networks by preventing co-adaptation of feature detectors

When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Deep Learning

Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

Dropout Training as Adaptive Regularization

By casting dropout as regularization, this work develops a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer and consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.
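
The core identity behind this view, in the authors' generalized-linear-model setting (notation adapted; delta is the drop probability and A the log-partition function): to second order, the dropout regularizer is an L2-type penalty adaptively weighted by the curvature A''.

R^{q}(\beta) \;=\; \frac{1}{2} \sum_{i} A''(x_i \cdot \beta)\,
\operatorname{Var}\!\left[\tilde{x}_i \cdot \beta\right]
\;=\; \frac{1}{2}\,\frac{\delta}{1-\delta}
\sum_{i} A''(x_i \cdot \beta) \sum_{j} x_{ij}^{2}\, \beta_j^{2}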

Learning by on-line gradient descent

On-line gradient-descent learning in multilayer networks is studied analytically and numerically for architectures with hidden layers and fixed hidden-to-output weights, such as the parity machine and the committee machine.
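
In the standard setting of this line of work, a soft committee machine with K hidden units, fixed hidden-to-output weights, and N-dimensional inputs xi is trained by the on-line update below (notation follows the statistical-mechanics literature generally, not this paper verbatim):

\sigma(\xi) = \sum_{i=1}^{K} g\!\left(J_i \cdot \xi\right),
\qquad
J_i^{\mu+1} = J_i^{\mu} + \frac{\eta}{N}
\bigl(\zeta^{\mu} - \sigma(\xi^{\mu})\bigr)\,
g'\!\left(J_i^{\mu} \cdot \xi^{\mu}\right)\xi^{\mu},

where zeta^mu is the teacher's output on example xi^mu and eta is the learning rate.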

A Fast Learning Algorithm for Deep Belief Nets

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
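
A compressed numpy sketch of the greedy procedure: train one RBM with CD-1, push the data through it, and train the next RBM on the resulting activations. Biases and the fine-tuning stage are omitted for brevity; this is a simplified illustration, not the paper's full algorithm.

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(V, n_hidden, lr=0.1, epochs=5):
    # One RBM trained with CD-1 (contrastive divergence), the building
    # block of the greedy procedure; biases omitted for brevity.
    W = 0.01 * rng.standard_normal((V.shape[1], n_hidden))
    for _ in range(epochs):
        h = sigmoid(V @ W)                               # up-pass probabilities
        h_sample = (rng.random(h.shape) < h).astype(float)
        v_rec = sigmoid(h_sample @ W.T)                  # down-pass reconstruction
        h_rec = sigmoid(v_rec @ W)
        W += lr * (V.T @ h - v_rec.T @ h_rec) / len(V)   # CD-1 update
    return W

def greedy_dbn(data, layer_sizes):
    # Greedy layer-wise stacking: each layer's hidden activations become
    # the "data" for the next layer, as the summary describes.
    weights, x = [], data
    for n in layer_sizes:
        W = train_rbm(x, n)
        weights.append(W)
        x = sigmoid(x @ W)
    return weights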

Recent advances in deep learning for speech research at Microsoft

L. Deng, Jinyu Li, A. Acero · 2013 IEEE International Conference on Acoustics, Speech and Signal Processing · 2013
An overview of the work by Microsoft speech researchers since 2009 is provided, focusing on more recent advances that shed light on the basic capabilities and limitations of current deep learning technology.

Ensemble Learning of Linear Perceptrons: On-Line Learning Theory

This work analyzes ensemble learning of linear perceptrons, including the noisy case where teacher or student noise is present, and examines the homogeneous correlation between the linear perceptrons used as teacher and students.
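
The basic effect analyzed here can be seen in a few lines of numpy: when K students deviate independently from a linear teacher, averaging their outputs shrinks the generalization error roughly by a factor of K (the setup below is illustrative, not the paper's exact model).

import numpy as np

rng = np.random.default_rng(1)
N, K, T = 50, 10, 2000        # input dim, ensemble size, test points

B = rng.standard_normal(N) / np.sqrt(N)          # teacher weight vector
J = B + 0.3 * rng.standard_normal((K, N))        # K imperfect students

X = rng.standard_normal((T, N))
single_err = np.mean((X @ J[0] - X @ B) ** 2)    # one student's error
ens_err = np.mean((X @ J.mean(0) - X @ B) ** 2)  # averaged ensemble's error
print(single_err, ens_err)   # the ensemble error is markedly smaller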

Pattern Recognition and Machine Learning

Probability Distributions, Linear Models for Regression, Linear Models for Classification, Neural Networks, Graphical Models, Mixture Models and EM, Sampling Methods, Continuous Latent Variables, and Sequential Data are studied.

On-line learning in soft committee machines.

D. Saad, S. A. Solla · Physical Review E, Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics · 1995