Corpus ID: 226254242

Generalized Negative Correlation Learning for Deep Ensembling

Sebastian Buschjäger, Lukas Pfahler, and Katharina Morik
Ensemble algorithms offer state-of-the-art performance in many machine learning applications. A common explanation for their excellent performance is the bias-variance decomposition of the mean squared error, which shows that an algorithm's error can be decomposed into its bias and its variance. The two quantities are often opposed to each other, and ensembles offer an effective way to manage them: they reduce the variance through a diverse set of base learners while keeping the bias low at…
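The decomposition the abstract refers to can be illustrated with a small simulation (a sketch only; the sine target, polynomial base learners, and noise level are arbitrary illustrative choices, not taken from the paper). Averaging many independently trained, high-variance base learners keeps the squared bias of the averaged prediction low while shrinking the variance:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Ground-truth regression target."""
    return np.sin(x)

x_test = np.linspace(0, np.pi, 50)
n_models, n_train = 100, 20

preds = []
for _ in range(n_models):
    # Each member sees its own small, noisy training sample.
    x = rng.uniform(0, np.pi, n_train)
    y = f(x) + rng.normal(0, 0.3, n_train)
    coef = np.polyfit(x, y, deg=5)            # a deliberately flexible (high-variance) base learner
    preds.append(np.polyval(coef, x_test))
preds = np.array(preds)                       # shape: (n_models, len(x_test))

# Variance of a single member vs. squared bias of the ensemble average.
single_var = preds.var(axis=0).mean()
ens_pred = preds.mean(axis=0)
bias_sq = ((ens_pred - f(x_test)) ** 2).mean()

print(f"avg single-model variance: {single_var:.4f}")
print(f"ensemble bias^2:           {bias_sq:.4f}")
```

For the squared loss, the averaged prediction's error is never worse than the average member error, which is exactly the lever the abstract describes.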

Ensemble deep learning: A review

Diversity and Generalization in Neural Network Ensembles

This work combines and expands previously published results into a theoretically sound framework that describes the relationship between diversity and ensemble performance for a wide range of ensemble methods, and empirically validates this theoretical analysis with neural network ensembles.

Joint Training of Deep Ensembles Fails Due to Learner Collusion

It is discovered that jointly optimizing the ensemble loss leads to a phenomenon in which base learners collude to artificially inflate their apparent diversity, and it is demonstrated that a balance between independent training and joint optimization can improve performance over the former while avoiding the degeneracies of the latter.

A Unified Theory of Diversity in Ensemble Learning

A theory of ensemble diversity is presented, explaining the nature and effect of diversity for a wide range of supervised learning scenarios, and it is revealed that diversity is in fact a hidden dimension in the bias-variance decomposition of an ensemble.
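The "hidden dimension" view can be checked numerically for the squared loss, where it reduces to the classic Krogh–Vedelsby ambiguity decomposition: average member error equals ensemble error plus the diversity (spread of members around the ensemble). A sketch on synthetic data (the data and member count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 5, 200
y = rng.normal(size=N)                            # targets
preds = y + rng.normal(scale=0.5, size=(M, N))    # M noisy member predictions

ens = preds.mean(axis=0)
avg_member_err = ((preds - y) ** 2).mean(axis=1).mean()
ens_err = ((ens - y) ** 2).mean()
diversity = ((preds - ens) ** 2).mean(axis=1).mean()  # spread around the ensemble

# avg member error = ensemble error + diversity (exact identity for squared loss)
print(avg_member_err, ens_err + diversity)
```

The identity holds exactly, so more diversity at fixed average member error directly lowers the ensemble error.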

Ensembles of Classifiers: a Bias-Variance Perspective

A dual reparameterization of the bias-variance decomposition of Bregman divergences is introduced, and it is shown that ensembles that directly average model outputs can arbitrarily increase or decrease the bias, and that such ensembles of neural networks may reduce the bias.

Ensembling Neural Networks for Improved Prediction and Privacy in Early Diagnosis of Sepsis

This work shows that an ensemble of a few selected patient-specific models can outperform a single model trained on much larger pooled datasets, making ensembles an ideal fit for machine learning on medical data, and exemplifies the framework of differentially private ensembles on the task of early prediction of sepsis.

Ensembling over Classifiers: a Bias-Variance Perspective

An empirical analysis of recent deep learning methods that ensemble over hyperparameters, revealing that these techniques indeed favor bias reduction, suggests that, contrary to classical wisdom, targeting bias reduction may be a promising direction for classifier ensembles.

There is no Double-Descent in Random Forests

This paper challenges the notion that model capacity is the correct tool to explain the success of random forests, argues that the algorithm which trains the model plays a more important role than previously thought, and introduces the Negative Correlation Forest (NCForest), which allows for precise control over the diversity of the ensemble.

Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement

This appendix accompanies the paper 'Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement'. It provides results for additional experiments that are not given in the paper due to space constraints.

Semi-Supervised Deep Ensembles for Blind Image Quality Assessment

This work investigates a semi-supervised ensemble learning method to produce generalizable blind image quality assessment models and conducts extensive experiments to demonstrate the advantages of employing unlabeled data for BIQA, especially in model generalization and failure identification.



Why M Heads are Better than One: Training a Diverse Ensemble of Deep Networks

It is demonstrated that TreeNets can improve ensemble performance and that diverse ensembles can be trained end-to-end under a unified loss, achieving significantly higher "oracle" accuracies than classical ensembles.

Generalized Ambiguity Decompositions for Classification with Applications in Active Learning and Unsupervised Ensemble Pruning

This work generalizes the classic Ambiguity Decomposition from regression problems with squared loss to classification problems with any twice-differentiable loss function, including the logistic loss in logistic regression, the exponential loss in boosting methods, and the 0-1 loss in many other classification tasks.

Snapshot Ensembles: Train 1, get M for free

This paper proposes a method to achieve the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost: training a single neural network, letting it converge to several local minima along its optimization path, and saving the model parameters at each.
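The mechanism behind Snapshot Ensembles is a cyclic learning-rate schedule that repeatedly anneals to near zero (where a snapshot is saved) and then restarts. A minimal sketch of such a cyclic cosine schedule (the function name and parameters here are illustrative, not the paper's code):

```python
import math

def snapshot_lr(step, total_steps, n_cycles, lr_max):
    """Cyclic cosine annealing: the learning rate restarts to lr_max at the
    start of each cycle and decays toward 0 by its end, where a model
    snapshot would be saved as one ensemble member."""
    steps_per_cycle = total_steps // n_cycles
    t = (step % steps_per_cycle) / steps_per_cycle   # position within the cycle, in [0, 1)
    return lr_max / 2 * (math.cos(math.pi * t) + 1)

print(snapshot_lr(0, 1000, 5, 0.1))    # lr_max at cycle start
print(snapshot_lr(199, 1000, 5, 0.1))  # near 0 at cycle end
print(snapshot_lr(200, 1000, 5, 0.1))  # restarts to lr_max
```

Each annealing phase lets the network settle into a local minimum; the restart kicks it out again, so one training run yields several diverse snapshots to average at test time.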

Managing Diversity in Regression Ensembles

It is demonstrated that these methods control the bias-variance-covariance trade-off systematically and can be utilised with any estimator capable of minimising a quadratic error function, for example MLPs or RBF networks.
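The trade-off this paper manages is the one negative correlation learning (NCL) makes explicit: each member minimises its own squared error minus a penalty for agreeing with the ensemble mean, L_i = (f_i − y)² − λ(f_i − f̄)². A toy numpy sketch with linear base learners (λ, the learning rate, and the data are arbitrary choices, and this is the classic penalty, not the paper's generalized formulation):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

M, lam, lr = 4, 0.5, 0.05
W = rng.normal(scale=0.1, size=(M, 3))            # one linear model per ensemble member

for _ in range(500):
    preds = W @ X.T                               # member predictions, shape (M, 100)
    fbar = preds.mean(axis=0)                     # ensemble mean prediction
    # Gradient of (f_i - y)^2 - lam * (f_i - fbar)^2 w.r.t. f_i,
    # treating fbar as constant (a common simplification in NCL).
    g = 2 * (preds - y) - 2 * lam * (preds - fbar)
    W -= lr * (g @ X) / len(y)

ens_mse = ((W.mean(axis=0) @ X.T - y) ** 2).mean()
print(f"ensemble MSE: {ens_mse:.4f}")
```

Summing the member gradients shows the diversity terms cancel, so the ensemble mean still descends the plain squared error while λ controls how strongly members are pushed away from it during training.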

Understanding deep learning requires rethinking generalization

These experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data, and confirm that simple depth two neural networks already have perfect finite sample expressivity.

Diversity With Cooperation: Ensemble Methods for Few-Shot Classification

This work shows that by addressing the fundamental high-variance issue of few-shot learning classifiers, it is possible to significantly outperform current meta-learning techniques.

Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles

This work poses the task of producing multiple outputs as a learning problem over an ensemble of deep networks -- introducing a novel stochastic gradient descent based approach to minimize the loss with respect to an oracle.

A Unified Bias-Variance Decomposition

This article defines bias and variance for an arbitrary loss function and shows that the resulting decomposition specializes to the standard one for the squared-loss case, and to a close relative of Kong and Dietterich's (1995) decomposition for the zero-one case.

Learning with Pseudo-Ensembles

A novel regularizer based on making the behavior of a pseudo-ensemble robust with respect to the noise process generating it is presented, which naturally extends to the semi-supervised setting, where it produces state-of-the-art results.

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients

AdaBelief is proposed to simultaneously achieve three goals: fast convergence as in adaptive methods, good generalization as in SGD, and training stability; it outperforms other methods with fast convergence and high accuracy on image classification and language modeling.