• Corpus ID: 51888895

Using Feature Grouping as a Stochastic Regularizer for High-Dimensional Noisy Data

@article{Aydre2018UsingFG,
  title={Using Feature Grouping as a Stochastic Regularizer for High-Dimensional Noisy Data},
  author={Serg{\"u}l Ayd{\"o}re and Bertrand Thirion and Olivier Grisel and Ga{\"e}l Varoquaux},
  journal={ArXiv},
  year={2018},
  volume={abs/1807.11718}
}
• Published 31 July 2018
• Computer Science
• ArXiv
The use of complex models --with many parameters-- is challenging in high-dimensional small-sample problems: they face rapid overfitting. Such situations are common when data collection is expensive, as in neuroscience, biology, or geology. Dedicated regularization can be crafted to tame overfitting, typically via structured penalties, but rich penalties require mathematical expertise and entail large computational costs. Stochastic regularizers such as dropout are easier to implement…
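The mechanism can be sketched in a few lines of NumPy. Note that this toy version uses a uniformly random partition of the features, which is a hypothetical simplification; the paper builds the groups with fast structured clustering instead:

```python
import numpy as np

def random_group_matrix(n_features, n_groups, rng):
    """Build an (n_groups, n_features) averaging matrix from a random
    partition of the features; resampling it each training step acts as
    a stochastic regularizer on the reduced representation."""
    labels = rng.integers(0, n_groups, size=n_features)
    P = np.zeros((n_groups, n_features))
    for g in range(n_groups):
        members = np.flatnonzero(labels == g)
        if members.size:
            P[g, members] = 1.0 / members.size
    return P

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 100))         # small-sample, high-dimensional batch
P = random_group_matrix(100, 10, rng)  # resampled at every training step
X_reduced = X @ P.T                    # (32, 10): group-averaged features
```

Each row of `P` averages one group, so the model is fit on a low-dimensional, noise-suppressed view of the data that changes stochastically across steps.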

Figures and Tables from this paper

ASNI: Adaptive Structured Noise Injection for shallow and deep neural networks

• Computer Science
• ArXiv
• 2019
This work proposes a generalisation of dropout and other multiplicative noise injection schemes for shallow and deep neural networks, where the random noise applied to different units is not independent but follows a joint distribution that is either fixed or estimated during training.
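A minimal sketch of that idea, assuming a fixed AR(1)-style noise covariance coupling neighbouring units (hypothetical; ASNI can also estimate the joint distribution during training):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
# Covariance decaying with unit distance: the noise applied to different
# units is correlated rather than i.i.d. as in standard dropout.
cov = 0.5 ** np.abs(np.subtract.outer(np.arange(d), np.arange(d)))
noise = rng.multivariate_normal(mean=np.ones(d), cov=0.1 * cov, size=16)
h = rng.normal(size=(16, d))   # activations of a hidden layer, batch of 16
h_noisy = h * noise            # structured multiplicative noise injection
```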

Interpretable LSTMs For Whole-Brain Neuroimaging Analyses

• Computer Science
• ArXiv
• 2018
This work introduces the DLight framework, which uses a long short-term memory (LSTM) based deep neural network architecture to analyze the spatial dependency structure of whole-brain fMRI data; it outperforms conventional decoding approaches while still detecting physiologically appropriate brain areas for the classified cognitive states.

Towards a Deep Network Architecture for Structured Smoothness

• Computer Science
• ICLR
• 2020
We propose the Fixed Grouping Layer (FGL), a novel feedforward layer designed to incorporate the inductive bias of structured smoothness into a deep learning model. FGL achieves this goal by…
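Based only on the snippet above, a grouped layer of this kind might look like the following sketch, in which each output unit is wired to one fixed group of inputs; the exact FGL parameterization is an assumption here and is given in the paper:

```python
import numpy as np

def grouped_forward(x, groups, weights):
    """Sketch of a group-structured feedforward layer: output unit g sees
    only the input features assigned to group g (assumed form)."""
    return np.array([x[idx] @ w for idx, w in zip(groups, weights)])

x = np.arange(6, dtype=float)
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]   # fixed partition
weights = [np.ones(3), np.ones(3)]                    # learnable in practice
y = grouped_forward(x, groups, weights)               # → array([ 3., 12.])
```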

On-line Data Analysis and Optimization of Crankshaft Dynamic Balance

• Materials Science, Engineering
• IOP Conference Series: Materials Science and Engineering
• 2019
The crankshaft is a high-speed rotating component that is widely used in many fields. Dynamic balance of the crankshaft is an important process stage that affects its quality. In…

References

SHOWING 1-10 OF 43 REFERENCES

Adam: A Method for Stochastic Optimization

• Computer Science
• ICLR
• 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
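For reference, a single Adam step as described: exponential moving averages of the gradient and its square, with bias correction. The hyperparameter values below are the defaults from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2015): adaptive estimates of the
    first and second moments of the gradient, bias-corrected by step t."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimise ||theta||^2 for a few steps as a smoke test
theta = np.array([1.0, -1.0])
m = v = np.zeros_like(theta)
for t in range(1, 101):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)
```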

Adding noise to the input of a model trained with a regularized objective

• Mathematics, Computer Science
ArXiv
• 2011
This work derives the higher order terms of the Taylor expansion and analyzes the coefficients of the regularization terms induced by the noisy input of a parametric function to study the effect of penalizing the Hessian of the mapping function with respect to the input in terms of generalization performance.
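The standard small-noise expansion behind this analysis, shown here to second order for isotropic Gaussian input noise (the cited work derives the higher-order terms):

```latex
\mathbb{E}_{\varepsilon}\!\left[L(x+\varepsilon)\right]
\approx L(x) + \mathbb{E}[\varepsilon]^{\top}\nabla L(x)
+ \tfrac{1}{2}\,\mathbb{E}\!\left[\varepsilon^{\top}\nabla^{2}L(x)\,\varepsilon\right]
= L(x) + \tfrac{\sigma^{2}}{2}\,\operatorname{tr}\!\left(\nabla^{2}L(x)\right),
\qquad \varepsilon \sim \mathcal{N}(0,\sigma^{2}I).
```

The first-order term vanishes because the noise has zero mean, leaving a penalty on the trace of the Hessian of the mapping with respect to the input.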

Training robust models using Random Projection

• Computer Science
• 2016 23rd International Conference on Pattern Recognition (ICPR)
• 2016
This paper shows how robust neural networks can be trained using random projection, and shows that while random projection acts as a strong regularizer, boosting model accuracy similar to other regularizers, it is far more robust to adversarial noise and fooling samples.
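A Gaussian random projection of the inputs can be sketched as follows; the `1/sqrt(k)` scaling keeps norms roughly preserved, and the network would then be trained on the projected inputs (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(64, 1000))   # high-dimensional inputs
k = 50                            # target dimension

# Entries drawn i.i.d. N(0, 1/k) so that E[||R x||^2] = ||x||^2
R = rng.normal(scale=1.0 / np.sqrt(k), size=(k, 1000))
X_proj = X @ R.T                  # (64, 50) projected inputs
```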

Structured sparsity through convex optimization

• Computer Science
ArXiv
• 2011
It is shown that the $\ell_1$-norm can be extended to structured norms built on either disjoint or overlapping groups of variables, leading to a flexible framework that can deal with various structures.
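One member of this family is the $\ell_1/\ell_2$ group-lasso norm over disjoint groups, sketched below; overlapping-group variants sum block norms over overlapping index sets in the same way:

```python
import numpy as np

def group_l1_l2(w, groups):
    """Sum over groups of the l2 norm of each block of coefficients:
    the classic group-lasso penalty, which zeroes out whole groups."""
    return sum(np.linalg.norm(w[idx]) for idx in groups)

w = np.array([3.0, 4.0, 0.0, 0.0, 5.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4])]
penalty = group_l1_l2(w, groups)   # → 5.0 + 0.0 + 5.0 = 10.0
```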

Random Projections as Regularizers: Learning a Linear Discriminant Ensemble from Fewer Observations than Dimensions

• Computer Science
• ACML
• 2013
It is shown that the randomly-projected ensemble is equivalent to implementing a sophisticated regularization scheme on the linear discriminant learned in the original data space, and that this prevents overfitting in conditions of small sample size where the pseudo-inverse FLD learned in the data space is provably poor.
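The scheme described can be sketched as follows: fit a pseudo-inverse FLD in each randomly projected space and average the decision scores. All constants here are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, k, n_proj = 40, 200, 10, 25        # fewer observations than dimensions
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)
X[y == 1] += 1.0                          # shift class 1 to make it separable

def fld_direction(Xp, y):
    """Pseudo-inverse Fisher discriminant in the projected space."""
    mu0, mu1 = Xp[y == 0].mean(0), Xp[y == 1].mean(0)
    Sw = np.cov(Xp[y == 0], rowvar=False) + np.cov(Xp[y == 1], rowvar=False)
    return np.linalg.pinv(Sw) @ (mu1 - mu0)

# Ensemble: average decision scores over many random projections
scores = np.zeros(n)
for _ in range(n_proj):
    R = rng.normal(size=(k, d)) / np.sqrt(k)
    Xp = X @ R.T
    w = fld_direction(Xp, y)
    b = -0.5 * w @ (Xp[y == 0].mean(0) + Xp[y == 1].mean(0))
    scores += Xp @ w + b
acc = ((scores > 0).astype(int) == y).mean()   # training accuracy
```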

Recursive Nearest Agglomeration (ReNA): Fast Clustering for Approximation of Structured Signals

• Computer Science
• IEEE Transactions on Pattern Analysis and Machine Intelligence
• 2019
This work contributes a linear-time agglomerative clustering scheme, Recursive Nearest Agglomeration (ReNA), that approximates the data as well as traditional variance-minimizing clustering schemes that have a quadratic complexity, and shows that it can remove noise, improving subsequent analysis steps.
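On a 1-D grid, the noise-removal effect of feature agglomeration can be illustrated with a crude neighbour-averaging stand-in; ReNA itself handles arbitrary neighbourhood graphs, in linear time, via recursive nearest-neighbor merges:

```python
import numpy as np

def neighbor_agglomerate(X, factor):
    """Average each run of `factor` adjacent features: a toy stand-in for
    neighbourhood-constrained agglomerative clustering on a 1-D grid."""
    n, d = X.shape
    return X[:, : d - d % factor].reshape(n, -1, factor).mean(axis=2)

rng = np.random.default_rng(4)
base = rng.normal(size=(30, 50))
# Piecewise-constant signal (structure aligned with the grid) plus noise
X = np.repeat(base, 10, axis=1) + rng.normal(size=(30, 500))
X_red = neighbor_agglomerate(X, 10)  # (30, 50); noise std drops ~sqrt(10)-fold
```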

The composite absolute penalties family for grouped and hierarchical variable selection

• Computer Science
• 2009
CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with $p\gg n$ and possibly mis-specified groupings, and iCAP is seen to be parsimonious in the experiments.

Dropout: a simple way to prevent neural networks from overfitting

• Computer Science
• J. Mach. Learn. Res.
• 2014
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
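The training-time operation reduces to a random binary mask. The "inverted" scaling shown here, one common implementation, keeps the expected activation unchanged so that no rescaling is needed at test time:

```python
import numpy as np

def dropout(h, p, rng, train=True):
    """Inverted dropout: zero each unit with probability p during training
    and rescale the survivors by 1/(1-p)."""
    if not train:
        return h
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

rng = np.random.default_rng(5)
h = np.ones((4, 1000))
h_drop = dropout(h, p=0.5, rng=rng)   # units are either 0.0 or 2.0
```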

Efficient clustering of high-dimensional data sets with application to reference matching

• Computer Science
• KDD '00
• 2000
This work presents a new technique for clustering large datasets, using a cheap, approximate distance measure to efficiently divide the data into overlapping subsets the authors call canopies, and presents experimental results on grouping bibliographic citations from the reference sections of research papers.
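The canopy procedure described can be sketched as follows. The thresholds are illustrative, and the cheap distance in the paper is an inverted-index measure over strings rather than the L1 distance used in this toy:

```python
import numpy as np

def canopies(X, t1, t2, rng):
    """Canopy clustering sketch: pick a random remaining point, put every
    point within the loose threshold t1 (under a cheap distance) into its
    canopy, and remove points within the tight threshold t2 < t1 from
    further consideration. Canopies may overlap; an expensive clustering
    method is then run only within each canopy."""
    assert t2 < t1
    remaining = set(range(len(X)))
    result = []
    while remaining:
        c = rng.choice(sorted(remaining))
        dist = np.abs(X - X[c]).sum(axis=1)   # cheap L1 distance
        result.append(set(np.flatnonzero(dist < t1)))
        remaining -= set(np.flatnonzero(dist < t2))
    return result

rng = np.random.default_rng(6)
# Two well-separated blobs of 20 points each
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
groups = canopies(X, t1=3.0, t2=1.5, rng=rng)
```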