Corpus ID: 40546306

The Hybrid Bootstrap: A Drop-in Replacement for Dropout

@article{Kosar2018TheHB,
  title={The Hybrid Bootstrap: A Drop-in Replacement for Dropout},
  author={Robert Kosar and David W. Scott},
  journal={ArXiv},
  year={2018},
  volume={abs/1801.07316}
}
Regularization is an important component of predictive model building. The hybrid bootstrap is a regularization technique that functions similarly to dropout except that features are resampled from other training points rather than replaced with zeros. We show that the hybrid bootstrap offers superior performance to dropout. We also present a sampling-based technique to simplify hyperparameter choice. Next, we provide an alternative sampling technique for convolutional neural networks. Finally…
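For orientation, here is a minimal NumPy sketch of the corruption step the abstract describes, contrasted with dropout's zeroing. The function name, the `swap_prob` parameter, and the uniform choice of donor points are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hybrid_bootstrap(batch, train_pool, swap_prob=0.5, rng=None):
    """Corrupt a mini-batch by replacing a random subset of each example's
    features with the corresponding features of other training points.

    Sketch only: `swap_prob` (the fraction of features resampled) and the
    uniform donor sampling are assumptions, not the paper's exact scheme.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = np.asarray(batch, dtype=float)
    train_pool = np.asarray(train_pool, dtype=float)

    # Draw one "donor" training point for each example in the batch.
    donors = train_pool[rng.integers(0, len(train_pool), size=len(batch))]

    # Boolean mask of features to resample (True = take the donor's value).
    mask = rng.random(batch.shape) < swap_prob

    # Dropout would set the masked features to zero; the hybrid bootstrap
    # fills them with values taken from other training points instead.
    return np.where(mask, donors, batch)

# Example: corrupt a mini-batch of 4 examples with 8 features each.
X_train = np.random.randn(100, 8)
corrupted = hybrid_bootstrap(X_train[:4], X_train, swap_prob=0.5)
```

In use, the corrupted batch would be fed to the network during training at the point where dropout noise would otherwise be applied.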
Citations

Deep learning for tabular data: an exploratory study
TLDR
Shows how the choice of data representation can improve the quality of deep learning on real-world tabular problems.

References

Showing 1-10 of 24 references
Dropout Training as Adaptive Regularization
TLDR
By casting dropout as regularization, this work develops a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer and consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.
DART: Dropouts meet Multiple Additive Regression Trees
TLDR
A novel way of employing dropouts in MART is proposed, resulting in the DART algorithm, which outperforms MART on each of the tasks by a significant margin and overcomes the issue of over-specialization to a considerable extent.
Regularization of Neural Networks using DropConnect
TLDR
This work introduces DropConnect, a generalization of Dropout, for regularizing large fully-connected layers within neural networks, and derives a bound on the generalization performance of both Dropout and DropConnect.
Dropout: a simple way to prevent neural networks from overfitting
TLDR
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Learning with Marginalized Corrupted Features
TLDR
This work proposes to corrupt training examples with noise from known distributions within the exponential family and presents a novel learning algorithm, called marginalized corrupted features (MCF), that trains robust predictors by minimizing the expected value of the loss function under the corrupting distribution.
Data Augmentation by Pairing Samples for Images Classification
H. Inoue · ArXiv · 2018
TLDR
This paper introduces a simple but surprisingly effective data augmentation technique for image classification tasks, named SamplePairing, which significantly improved classification accuracy for all the tested datasets and is more valuable for tasks with a limited amount of training data, such as medical imaging tasks.
Training with Noise is Equivalent to Tikhonov Regularization
TLDR
This paper shows that for the purposes of network training, the regularization term can be reduced to a positive semi-definite form that involves only first derivatives of the network mapping.
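For context, a hedged sketch of the classical result this entry summarizes (Bishop, 1995), stated here for squared error and small isotropic Gaussian input noise; the exact form and assumptions in the cited paper may differ.

```latex
% Input noise \varepsilon \sim \mathcal{N}(0, \sigma^2 I); expand f(x+\varepsilon) to first order:
\mathbb{E}_{\varepsilon}\!\left[(f(x+\varepsilon) - y)^2\right]
  \;\approx\; (f(x) - y)^2 \;+\; \sigma^2 \,\lVert \nabla_x f(x) \rVert^2
% i.e. the original loss plus a Tikhonov-style penalty involving only
% first derivatives of the network mapping (second-derivative terms
% are negligible to this order).
```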
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.
Wide Residual Networks
TLDR
This paper conducts a detailed experimental study on the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width increased; the resulting network structures, called wide residual networks (WRNs), are far superior to their commonly used thin and very deep counterparts.
Greedy function approximation: A gradient boosting machine.
Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization.