Generalized Negative Correlation Learning for Deep Ensembling
@article{Buschjger2020GeneralizedNC, title={Generalized Negative Correlation Learning for Deep Ensembling}, author={Sebastian Buschj{\"a}ger and Lukas Pfahler and Katharina Morik}, journal={ArXiv}, year={2020}, volume={abs/2011.02952} }
Ensemble algorithms offer state-of-the-art performance in many machine learning applications. A common explanation for their excellent performance is the bias-variance decomposition of the mean squared error, which shows that an algorithm's error can be decomposed into its bias and variance. The two quantities are often opposed to each other, and ensembles offer an effective way to manage them: they reduce the variance through a diverse set of base learners while keeping the bias low at…
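For readers skimming the listing, the decomposition the abstract refers to is, in its standard textbook form (not the paper's generalized loss), for a model fit on a random training set D and evaluated against noisy targets y = f(x) + ε with noise variance σ²:

```latex
% Standard bias-variance decomposition of the mean squared error at a point x.
\mathbb{E}_{D,\varepsilon}\!\left[(\hat{h}_D(x) - y)^2\right]
  = \underbrace{\bigl(\mathbb{E}_D[\hat{h}_D(x)] - f(x)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\bigl(\hat{h}_D(x) - \mathbb{E}_D[\hat{h}_D(x)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```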
10 Citations
Diversity and Generalization in Neural Network Ensembles
- Computer Science, Environmental Science · AISTATS
- 2022
This work combines and expands previously published results into a theoretically sound framework that describes the relationship between diversity and ensemble performance for a wide range of ensemble methods, and empirically validates this theoretical analysis with neural network ensembles.
Joint Training of Deep Ensembles Fails Due to Learner Collusion
- Computer Science
- 2023
It is discovered that joint optimization of the ensemble loss results in a phenomenon in which base learners collude to artificially inflate their apparent diversity, and it is demonstrated that a balance between independent training and joint optimization can improve performance over the former while avoiding the degeneracies of the latter.
A Unified Theory of Diversity in Ensemble Learning
- Computer Science
- 2023
A theory of ensemble diversity is presented, explaining the nature and effect of diversity for a wide range of supervised learning scenarios, and it is revealed that diversity is in fact a hidden dimension in the bias-variance decomposition of an ensemble.
Ensembles of Classifiers: a Bias-Variance Perspective
- Computer Science
- 2022
A dual reparameterization of the bias-variance decomposition for Bregman divergences is introduced, and it is shown that ensembles which directly average model outputs can arbitrarily increase or decrease the bias, and that in practice such ensembles of neural networks may reduce the bias.
Ensembling Neural Networks for Improved Prediction and Privacy in Early Diagnosis of Sepsis
- Computer Science · ArXiv
- 2022
This work shows that an ensemble of a few selected patient-specific models, which outperforms a single model trained on much larger pooled datasets, is an ideal fit for machine learning on medical data, and exemplifies the framework of differentially private ensembles on the task of early prediction of sepsis.
Ensembling over Classifiers: a Bias-Variance Perspective
- Computer Science · ArXiv
- 2022
An empirical analysis of recent deep learning methods that ensemble over hyperparameters reveals that these techniques indeed favor bias reduction, suggesting that, contrary to classical wisdom, targeting bias reduction may be a promising direction for classifier ensembles.
There is no Double-Descent in Random Forests
- Computer Science · ArXiv
- 2021
This paper challenges the notion that model capacity is the correct tool to explain the success of random forests, argues that the algorithm which trains the model plays a more important role than previously thought, and introduces the Negative Correlation Forest (NCForest), which allows precise control over the diversity of the ensemble.
Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement
- Computer Science · ArXiv
- 2021
This appendix accompanies the paper ‘Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement’. It provides results for additional experiments that are not included in the paper due to…
Semi-Supervised Deep Ensembles for Blind Image Quality Assessment
- Computer Science · ArXiv
- 2021
This work investigates a semi-supervised ensemble learning method to produce generalizable blind image quality assessment (BIQA) models and conducts extensive experiments to demonstrate the advantages of employing unlabeled data for BIQA, especially in model generalization and failure identification.
References
SHOWING 1-10 OF 70 REFERENCES
Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks
- Computer Science · ArXiv
- 2015
It is demonstrated that TreeNets can improve ensemble performance and that diverse ensembles can be trained end-to-end under a unified loss, achieving significantly higher "oracle" accuracies than classical ensembles.
Generalized Ambiguity Decompositions for Classification with Applications in Active Learning and Unsupervised Ensemble Pruning
- Computer Science · AAAI
- 2017
This work generalizes the classic ambiguity decomposition from regression problems with square loss to classification problems with any twice-differentiable loss function, including the logistic loss in logistic regression, the exponential loss in boosting methods, and the 0-1 loss in many other classification tasks.
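For reference, the classic square-loss ambiguity decomposition that this work generalizes, for an averaged ensemble of M members:

```latex
% Ambiguity decomposition for the square loss: the ensemble error equals the
% average individual error minus the average spread ("ambiguity") of the
% members around the ensemble prediction.
\bigl(\bar{f}(x) - y\bigr)^2
  = \frac{1}{M}\sum_{i=1}^{M}\bigl(f_i(x) - y\bigr)^2
  - \frac{1}{M}\sum_{i=1}^{M}\bigl(f_i(x) - \bar{f}(x)\bigr)^2,
\qquad \bar{f}(x) = \frac{1}{M}\sum_{i=1}^{M} f_i(x)
```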
Snapshot Ensembles: Train 1, get M for free
- Computer Science · ICLR
- 2017
This paper proposes a method to achieve the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost: a single neural network is trained, converging to several local minima along its optimization path, and the model parameters are saved at each of them.
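A minimal sketch of the snapshot idea (one training run with a cyclic cosine learning rate, one checkpoint kept per cycle); the PyTorch-based function and hyperparameter names below are illustrative placeholders, not code from the paper:

```python
# Snapshot ensembling sketch: a cyclic cosine learning-rate schedule drives the
# network into a local minimum at the end of each cycle, where a copy of the
# weights is stored as one ensemble member.
import copy
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

def train_snapshot_ensemble(model, loader, loss_fn, cycles=5, epochs_per_cycle=40, lr=0.1):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=epochs_per_cycle)  # restart every cycle
    snapshots = []
    for epoch in range(cycles * epochs_per_cycle):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        scheduler.step()  # advance the cosine schedule once per epoch
        if (epoch + 1) % epochs_per_cycle == 0:
            # Learning rate is near zero here: keep this local minimum as a member.
            snapshots.append(copy.deepcopy(model.state_dict()))
    return snapshots
```

At test time the saved state dicts are loaded into copies of the model and their predictions averaged.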
Managing Diversity in Regression Ensembles
- Computer Science · J. Mach. Learn. Res.
- 2005
It is demonstrated that these methods control the bias-variance-covariance trade-off systematically and can be utilised with any estimator capable of minimising a quadratic error function, for example MLPs or RBF networks.
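A brief reminder of the trade-off referred to here, in its standard form for an average of M regressors and a fixed target y (bars denote averages of the individual biases, variances, and pairwise covariances):

```latex
% Bias-variance-covariance decomposition of an averaged regression ensemble:
% as M grows the variance term shrinks and the covariance term dominates,
% which is why lowering the covariance (raising diversity) pays off.
\mathbb{E}\!\left[\bigl(\bar{f}(x) - y\bigr)^2\right]
  = \overline{\mathrm{bias}}^{\,2}
  + \frac{1}{M}\,\overline{\mathrm{var}}
  + \left(1 - \frac{1}{M}\right)\overline{\mathrm{cov}}
```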
Understanding deep learning requires rethinking generalization
- Computer Science · ICLR
- 2017
These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth-two neural networks already have perfect finite-sample expressivity.
Diversity With Cooperation: Ensemble Methods for Few-Shot Classification
- Computer Science · 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work shows that by addressing the fundamental high-variance issue of few-shot learning classifiers, it is possible to significantly outperform current meta-learning techniques.
Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
- Computer Science · NIPS
- 2016
This work poses the task of producing multiple outputs as a learning problem over an ensemble of deep networks -- introducing a novel stochastic gradient descent based approach to minimize the loss with respect to an oracle.
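A minimal sketch of the oracle ("winner-take-gradient") objective described here, assuming `members` is a plain Python list of K PyTorch models (illustrative names, not code from the paper):

```python
# Stochastic multiple choice learning style objective: for every example only
# the currently best ensemble member receives gradient, which pushes the
# members to specialize on different parts of the data.
import torch
import torch.nn.functional as F

def oracle_loss(members, x, y):
    # Per-member, per-example losses, stacked into shape (K, batch).
    losses = torch.stack([F.cross_entropy(m(x), y, reduction="none") for m in members])
    # Keep only the minimum loss per example, then average over the batch.
    return losses.min(dim=0).values.mean()
```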
A Unified Bias-Variance Decomposition
- Computer Science
This article defines bias and variance for an arbitrary loss function and shows that the resulting decomposition specializes to the standard one in the squared-loss case and to a close relative of Kong and Dietterich's (1995) decomposition in the zero-one case.
Learning with Pseudo-Ensembles
- Computer Science · NIPS
- 2014
A novel regularizer is presented, based on making the behavior of a pseudo-ensemble robust with respect to the noise process generating it; the approach naturally extends to the semi-supervised setting, where it produces state-of-the-art results.
AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients
- Computer Science · NeurIPS
- 2020
AdaBelief is proposed to simultaneously achieve three goals: fast convergence as in adaptive methods, good generalization as in SGD, and training stability; it outperforms other methods with fast convergence and high accuracy on image classification and language modeling.
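For reference, the core of the update (bias-correction terms and the small epsilon added to the second-moment estimate are omitted for brevity): AdaBelief keeps Adam's first moment but scales the step by the deviation of the gradient from that moment rather than by the raw squared gradient:

```latex
% AdaBelief update sketch: s_t tracks (g_t - m_t)^2, i.e. how far the observed
% gradient deviates from the "belief" m_t, instead of Adam's g_t^2.
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad
s_t = \beta_2 s_{t-1} + (1-\beta_2)\,(g_t - m_t)^2, \qquad
\theta_{t+1} = \theta_t - \frac{\alpha\, m_t}{\sqrt{s_t} + \epsilon}
```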