Agnostic Boosting

Shai Ben-David, Philip M. Long, and Y. Mansour
We extend the boosting paradigm to the realistic setting of agnostic learning, that is, to a setting where the training sample is generated by an arbitrary (unknown) probability distribution over examples and labels. We define a β-weak agnostic learner with respect to a hypothesis class F as follows: given a distribution P, it outputs some hypothesis h ∈ F whose error is at most er_P(F) + β, where er_P(F) is the minimal error of a hypothesis from F under the distribution P (note that for some…
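The bound in this definition can be made concrete on a toy problem. The sketch below is only an illustration, not the authors' algorithm: it assumes a finite distribution P given as explicit probabilities over four labeled points and a small class F of threshold functions, and it checks whether a candidate hypothesis satisfies er_P(h) ≤ er_P(F) + β for that single P. All names (`error`, `best_in_class_error`, `satisfies_weak_agnostic_bound`, `threshold`) are hypothetical and introduced here for illustration.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[float, int]            # (x, label), label in {0, 1}
Hypothesis = Callable[[float], int]


def error(h: Hypothesis, examples: Sequence[Example],
          probs: Sequence[float]) -> float:
    """er_P(h): total probability mass of the examples that h mislabels."""
    return sum(p for (x, y), p in zip(examples, probs) if h(x) != y)


def best_in_class_error(F: Sequence[Hypothesis], examples: Sequence[Example],
                        probs: Sequence[float]) -> float:
    """er_P(F): minimal error of a hypothesis from F under P."""
    return min(error(h, examples, probs) for h in F)


def satisfies_weak_agnostic_bound(h: Hypothesis, F: Sequence[Hypothesis],
                                  examples: Sequence[Example],
                                  probs: Sequence[float], beta: float) -> bool:
    """Check er_P(h) <= er_P(F) + beta for this one distribution P."""
    return error(h, examples, probs) <= best_in_class_error(F, examples, probs) + beta


def threshold(t: float) -> Hypothesis:
    """Hypothetical class member: predict 1 exactly when x >= t."""
    return lambda x: 1 if x >= t else 0


# Assumed toy distribution: labels alternate, so no threshold is error-free.
examples = [(0.1, 0), (0.4, 1), (0.6, 0), (0.9, 1)]
probs = [0.25, 0.25, 0.25, 0.25]
F = [threshold(t) for t in (0.0, 0.3, 0.5, 0.8, 1.0)]

print(best_in_class_error(F, examples, probs))
print(satisfies_weak_agnostic_bound(threshold(0.5), F, examples, probs, 0.25))
print(satisfies_weak_agnostic_bound(threshold(0.5), F, examples, probs, 0.1))
```

Because the labels are not consistent with any threshold, er_P(F) is strictly positive here, which is the situation the agnostic setting is meant to cover. Note that the check covers one fixed distribution only; the definition quoted above is a requirement on what the learner outputs when given a distribution P.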
On agnostic boosting and parity learning
This work gives an agnostic boosting theorem that is capable of nearly achieving optimal accuracy, improving upon earlier studies, and sheds light on the actual difficulty of agnostic learning by showing that full agnostic boosting is indeed possible.
Distribution-Specific Agnostic Boosting
This work demonstrates boosting algorithms for the agnostic learning framework that only modify the distribution on the labels of the points (or, equivalently, modify the target function) and proves the connection to hard-core set constructions discovered by Klivans and Servedio (1999), which imply distribution-specific agnostic boosting algorithms.
Optimally-Smooth Adaptive Boosting and Application to Agnostic Learning
A new boosting algorithm is described that is the first to be both smooth and adaptive, along with the construction of a boosting "tandem" whose asymptotic number of iterations is the lowest possible and whose smoothness is optimal in terms of O(·).
Boosting in the Presence of Massart Noise
This work presents the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to η, and gives the first efficient Massart learner for unions of high-dimensional rectangles.
On Boosting with Polynomially Bounded Distributions
A framework is constructed which allows an algorithm to turn the distributions produced by some boosting algorithms into polynomially smooth distributions, with minimal performance loss, and demonstrates AdaBoost's application to the task of DNF learning using membership queries.
Potential-Based Agnostic Boosting
This work proves strong noise-tolerance properties of a potential-based boosting algorithm, similar to MadaBoost and SmoothBoost, and gives alternative derivations of two recent nontrivial results in computational learning theory: agnostically learning decision trees and agnostically learning halfspaces.
Quantum Boosting using Domain-Partitioning Hypotheses
This work designs a different quantum boosting algorithm that uses domain partitioning hypotheses that are significantly more flexible than those used in prior quantum boosting algorithms in terms of margin calculations and shows that Q-RealBoost provides a polynomial speedup over Q-AdaBoost in both the bias of the weak learner and the time taken by the weak learners to learn the target concept class.
Online Agnostic Multiclass Boosting
This work gives the first boosting algorithm for online agnostic multiclass classification and enables the construction of algorithms for statistical agnostic, online realizable, and statistical realizable multiclass boosting.
Online Agnostic Boosting via Regret Minimization
This work provides the first agnostic online boosting algorithm, which efficiently converts an arbitrary online convex optimizer to an online booster, thus unifying the 4 cases of statistical/online and agnostic/realizable boosting.
Fast agnostic classification
It is conjectured that the KG algorithm may run in polynomial time when the inputs are normally distributed; such an algorithm would solve a notorious problem in computer science: learning logarithmically-sparse parities with noise.


Efficient Learning of Linear Perceptrons
It is proved that unless P=NP, there is no algorithm that runs in time polynomial in the sample size and in 1/µ that is µ-margin successful for all µ > 0.
Toward Efficient Agnostic Learning
An investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions is initiated, providing an initial outline of the possibilities for agnostic learning.
Toward Efficient Agnostic Learning
Results include hardness results for the most obvious generalization of the PAC model to an agnostic setting, an efficient and general agnostic learning method based on dynamic programming, relationships between loss functions for agnostic learning, and an algorithm for a learning problem that involves hidden variables.
Efficient Learning of Linear Perceptrons
It is proved that unless P=NP, there is no algorithm that runs in time polynomial in the sample size and in 1/µ that is µ-margin successful for all µ > 0, and a new measure of success for such algorithms is introduced.
A decision-theoretic generalization of on-line learning and an application to boosting
The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.
On the difficulty of approximately maximizing agreements
Data filtering and distribution modeling algorithms for machine learning
This thesis is concerned with the analysis of algorithms for machine learning and describes and analyses an algorithm for improving the performance of a general concept learning algorithm by selecting those labeled instances that are most informative.
Design and analysis of efficient learning algorithms
R. Schapire, 1992 (ACM Doctoral Dissertation Award, 1991)
A theory of the learnable
This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.