Boosting Nearest Neighbors for the Efficient Estimation of Posteriors

@inproceedings{DAmbrosio2012BoostingNN,
  title={Boosting Nearest Neighbors for the Efficient Estimation of Posteriors},
  author={Roberto D'Ambrosio and Richard Nock and Wafa Bel Haj Ali and Frank Nielsen and Michel Barlaud},
  booktitle={ECML/PKDD},
  year={2012}
}
It is an admitted fact that mainstream boosting algorithms like AdaBoost do not perform well at estimating class conditional probabilities. In this paper, we analyze, in the light of this problem, a recent algorithm, UNN, which leverages nearest neighbors while minimizing a convex loss. Our contribution is threefold. First, we show that there exists a subclass of surrogate losses, elsewhere called balanced, whose minimization brings simple and statistically efficient estimators for Bayes… 
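To give a concrete feel for the kind of posterior estimation the abstract refers to, the following minimal Python sketch maps a real-valued leveraged k-NN score to a class posterior through the inverse canonical link of a calibrated loss (here the logistic loss, whose link inverse is the sigmoid). This is an illustrative sketch under those assumptions, not the paper's estimator; the data, neighborhood size and leveraging coefficients are placeholders.

```python
# Minimal sketch (not the paper's estimator): turn a leveraged k-NN score
# into a posterior estimate by inverting the canonical link of a calibrated
# loss. For the logistic loss the link inverse is the sigmoid, so
# P(y = +1 | x) is estimated as sigmoid(score(x)). Data, neighborhood size
# and leveraging coefficients below are illustrative placeholders.
import numpy as np

def knn_indices(X_train, x, k):
    """Indices of the k nearest training points to x (Euclidean distance)."""
    d = np.linalg.norm(X_train - x, axis=1)
    return np.argsort(d)[:k]

def leveraged_knn_score(X_train, y_train, alpha, x, k=5):
    """Signed score: sum of leveraging coefficients times neighbor labels."""
    idx = knn_indices(X_train, x, k)
    return float(np.sum(alpha[idx] * y_train[idx]))

def posterior_from_score(score):
    """Logistic-loss link inverse: sigmoid maps a real score to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-score))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy binary problem with labels in {-1, +1}.
    X_train = rng.normal(size=(200, 2))
    y_train = np.where(X_train[:, 0] + X_train[:, 1] > 0.0, 1, -1)
    # Placeholder leveraging coefficients (a boosting loop would fit these).
    alpha = np.full(len(X_train), 0.3)

    x_query = np.array([1.0, 1.0])
    s = leveraged_knn_score(X_train, y_train, alpha, x_query, k=7)
    print("score:", s, "estimated P(y=+1|x):", posterior_from_score(s))
```

With a coefficient of 0.3 on each of 7 neighbors the score stays in [-2.1, 2.1], so the estimated posterior only approaches 0 or 1 when the neighborhood vote is unanimous.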
Gentle Nearest Neighbors Boosting over Proper Scoring Rules
TLDR
A lightweight Newton-Raphson alternative optimizing proper scoring rules from a very broad set is proposed, and formal convergence rates under the boosting framework that compete with those known for UNN are established.
Fast Newton Nearest Neighbors Boosting For Image Classification
TLDR
It is shown that N3, an adaptive Newton-Raphson scheme for leveraging k-NN, is a boosting algorithm with several key algorithmic and statistical properties, and that boosting a subsample may be sufficient to reach the desired bounds for the loss at hand in the boosting framework.
Boosting Stochastic Newton with Entropy Constraint for Large-Scale Image Classification
TLDR
This paper considers specific calibrated losses and proposes a Boosting Stochastic Newton Descent method for their minimization in the primal space, showing how to reliably estimate posteriors, binary entropy and margins.
Boosting Stochastic Newton Descent for Bigdata large scale classification
TLDR
A Boosting Stochastic Newton Descent method for the minimization of any calibrated loss in the primal space, which approximates the inverse Hessian by its best low-rank approximation and provides further core optimizations for fast convergence.
Constrained Convex Neyman-Pearson Classification Using an Outer Approximation Splitting Method
TLDR
This paper presents an efficient splitting algorithm for Neyman-Pearson classification that alternates a gradient step on the objective surrogate risk and an approximate projection step onto the constraint set, implemented by means of an outer approximation subgradient projection algorithm.
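As a rough illustration of the alternation just described (a gradient step on the objective surrogate risk, then an approximate projection onto the constraint set), here is a simplified Python sketch. It uses a plain logistic surrogate and a crude subgradient-based projection, not the paper's outer approximation splitting method; the threshold tau, step sizes and data are illustrative assumptions.

```python
# Simplified sketch of constrained surrogate-risk minimization in the
# Neyman-Pearson spirit: alternate a gradient step on the objective
# surrogate risk (positives) with an approximate projection onto the
# constraint set (surrogate false-positive risk <= tau). This is a generic
# projected-(sub)gradient stand-in, not the paper's outer approximation
# splitting method; tau, step sizes and data are illustrative assumptions.
import numpy as np

def logistic_loss(margins):
    """Element-wise logistic loss log(1 + exp(-margin))."""
    return np.log1p(np.exp(-margins))

def logistic_grad_w(X, y, w):
    """Gradient of the mean logistic loss over (X, y) with respect to w."""
    m = y * (X @ w)
    coef = -y / (1.0 + np.exp(m))   # per-example dL_i / d(x_i . w)
    return (X.T @ coef) / len(y)

def neyman_pearson_train(X_pos, X_neg, tau=0.3, steps=200, lr=0.1):
    """Minimize surrogate risk on positives s.t. surrogate risk on negatives <= tau."""
    w = np.zeros(X_pos.shape[1])
    y_pos = np.ones(len(X_pos))
    y_neg = -np.ones(len(X_neg))
    for _ in range(steps):
        # 1) Gradient step on the objective surrogate risk.
        w -= lr * logistic_grad_w(X_pos, y_pos, w)
        # 2) Approximate projection: while the constraint is violated,
        #    take (sub)gradient steps on the constraint function.
        for _ in range(20):
            if logistic_loss(y_neg * (X_neg @ w)).mean() <= tau:
                break
            w -= lr * logistic_grad_w(X_neg, y_neg, w)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_pos = rng.normal(loc=1.0, size=(300, 2))
    X_neg = rng.normal(loc=-1.0, size=(300, 2))
    w = neyman_pearson_train(X_pos, X_neg, tau=0.3)
    print("w:", w,
          "TPR:", np.mean(X_pos @ w > 0),
          "FPR:", np.mean(X_neg @ w > 0))
```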
Biomedical Images Classification by Universal Nearest Neighbours Classifier Using Posterior Probability
TLDR
It is shown that the application of the Universal Nearest Neighbours algorithm in conjunction with a reconstruction rule based on the posterior probabilities provides a classification scheme robust across different biomedical image datasets.
Hybrid weighted-stego detection using machine learning
TLDR
A new Hybrid Weighted Stego-detection (HWS) algorithm which benefits from two fundamental advantages: the posterior probability of detection is well estimated, and the numerical complexity of the algorithm is linear in the number of samples and the dimension of the features.
Handling imbalanced datasets by reconstruction rules in decomposition schemes. (Classification de bases de données déséquilibrées par des règles de décomposition)
TLDR
A statistical reconstruction rule designed for the 'One-per-Class' and 'Pair-Wise Coupling' decomposition approaches; we developed a rule that applies 'softmax' regression to the reliabilities in order to derive the final classification.

References

Showing 1-10 of 21 references
Leveraging k-NN for generic classification boosting
Special Invited Paper-Additive logistic regression: A statistical view of boosting
TLDR
This work shows that this seemingly mysterious phenomenon of boosting can be understood in terms of well-known statistical principles, namely additive modeling and maximum likelihood, and develops more direct approximations and shows that they exhibit nearly identical results to boosting.
On the Efficient Minimization of Classification Calibrated Surrogates
TLDR
An algorithm is proposed which provably minimizes any classification calibrated surrogate that is strictly convex and differentiable (a set whose losses span the exponential, logistic and squared losses), with boosting-type guaranteed convergence rates under a weak learning assumption.
Stopping Criterion for Boosting-Based Data Reduction Techniques: from Binary to Multiclass Problem
TLDR
The aim of the present paper is to relax the class constraint, and extend the contribution to multiclass problems, showing the benefits that the boosting-derived weighting rule brings to weighted nearest neighbor classifiers.
Bregman Divergences and Surrogates for Learning
  • R. Nock, F. Nielsen
  • IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009
TLDR
This paper addresses the problem for a wide set which lies at the intersection of classification calibrated surrogates and those of Murata et al. (2004), and gives a minimization algorithm provably converging to the minimum of any such surrogate.
Boosting k-NN for Categorization of Natural Scenes
TLDR
A novel boosting approach for generalizing the k-NN rule, providing a new k-NN boosting algorithm, called UNN (Universal Nearest Neighbors), for the induction of leveraged k-NN, which displays the ability of UNN to compete with or beat the other contenders while achieving comparatively small training and testing times.
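For intuition about what inducing a leveraged k-NN rule can look like, here is a schematic Python sketch that fits per-example leveraging coefficients by coordinate descent on the exponential surrogate. It is a generic illustration under these assumptions, not the published UNN pseudo-code; the data, neighborhood size and number of epochs are placeholders.

```python
# Schematic sketch, not the published UNN pseudo-code: fit per-example
# leveraging coefficients for a k-NN rule by coordinate descent on the
# exponential surrogate. Example j votes y_j on every example that has j
# among its k nearest neighbors; k and the number of epochs are
# illustrative placeholders.
import numpy as np

def reciprocal_neighbors(X, k):
    """recip[j] = indices i such that j is among the k nearest neighbors of i."""
    n = len(X)
    recip = [[] for _ in range(n)]
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                          # an example never votes on itself
        for j in np.argsort(d)[:k]:
            recip[j].append(i)
    return [np.array(r, dtype=int) for r in recip]

def fit_leveraged_knn(X, y, k=5, epochs=5, eps=1e-12):
    """Leveraging coefficients alpha via exponential-loss coordinate steps (y in {-1, +1})."""
    n = len(X)
    recip = reciprocal_neighbors(X, k)
    alpha = np.zeros(n)
    w = np.ones(n)                             # per-example boosting weights
    for _ in range(epochs):
        for j in range(n):
            idx = recip[j]
            if len(idx) == 0:
                continue
            agree = w[idx][y[idx] == y[j]].sum()
            disagree = w[idx][y[idx] != y[j]].sum()
            delta = 0.5 * np.log((agree + eps) / (disagree + eps))
            alpha[j] += delta
            # Multiplicative weight update on the examples j votes on.
            w[idx] *= np.exp(-delta * y[idx] * y[j])
    return alpha
```

At test time the leveraged prediction on a query x would be the sign of the neighbor vote, i.e. the sum of alpha_j * y_j over the k nearest training neighbors j of x (or that sum passed through a link inverse, as in the earlier sketch, to obtain a posterior).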
Convexity, Classification, and Risk Bounds
TLDR
A general quantitative relationship between the risk as assessed using the 0-1 loss and the risk as assessed using any nonnegative surrogate loss function is provided, and it is shown that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function.
Improved Boosting Algorithms Using Confidence-rated Predictions
We describe several improvements to Freund and Schapire's AdaBoost boosting algorithm, particularly in a setting in which hypotheses may assign confidences to each of their predictions. We give a…
Boosting the margin: A new explanation for the effectiveness of voting methods
TLDR
It is shown that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error.
On the boosting ability of top-down decision tree learning algorithms
TLDR
This work analyzes the performance of top-down algorithms for decision tree learning and proves that some popular and empirically successful heuristics that are based on first principles meet the criteria of an independently motivated theoretical model.