The strength of weak learnability

  • Robert E. Schapire
  • Published 30 October 1989
  • Computer Science
  • Machine Learning
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that… 

Boosting a weak learning algorithm by majority

An algorithm for improving the accuracy of algorithms for learning binary concepts by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples, is presented.
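As a rough illustration of why combining many hypotheses by vote can raise accuracy, the sketch below computes the accuracy of a simple majority vote. It is not the algorithm of the paper above: it assumes, purely for illustration, that each hypothesis is correct independently with the same probability p, an assumption a boosting algorithm does not get for free and has to work around by training on different sets of examples. The function names are hypothetical.

```python
from math import comb


def majority_vote(predictions):
    """Return 1 if strictly more than half of the 0/1 predictions are 1, else 0."""
    return 1 if 2 * sum(predictions) > len(predictions) else 0


def majority_accuracy(p, n):
    """Probability that a majority of n hypotheses is correct.

    Illustrative assumption: each hypothesis is correct independently
    with probability p. n should be odd so that ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    # Sum the binomial probabilities of every outcome with more than n/2 correct votes.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 11, 101):
        print(n, round(majority_accuracy(0.6, n), 4))
```

Under this independence assumption, hypotheses that are only slightly better than random guessing yield a vote whose accuracy grows with the number of voters.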

Conservativeness and monotonicity for learning algorithms

In this extended abstract, it is shown that the converse does not hold by giving a PAC-learning algorithm that is not a weak Occam algorithm, and that, under some natural conditions, a monotone PAC-learning algorithm for a hypothesis class can be transformed to a weak Occam algorithm without changing the hypothesis class.

Computing Optimal Hypotheses Efficiently for Boosting

  • S. Morishita
  • Computer Science
    Progress in Discovery Science
  • 2002
This paper sheds light on a strong connection between AdaBoost and several optimization algorithms for data mining and considers several classes of simple but expressive hypotheses such as ranges and regions for numeric attributes, subsets of categorical values, and conjunctions of Boolean tests.

Cryptographic hardness of distribution-specific learning

It is shown that under appropriate assumptions on the hardness of factoring, the learnability of Boolean formulas and constant depth threshold circuits on any distribution is characterized by the distribution’s Rényi entropy.

Mutual Information Gaining Algorithm and Its Relation to PAC-Learning Algorithm

In this paper, the mutual information between a target concept and a hypothesis is used to measure the goodness of the hypothesis rather than the accuracy, and a notion of mutual information gaining

Prediction-Preserving Reducibility

The Beneficial Effects of Using Multi-net Systems That Focus on Hard Patterns

This paper uses a novel technique to illustrate how AdaBoost effectively focuses its training in the regions near the decision border, and proposes a new method for training multi-net systems that shares this property with AdaBoost.

Query based hybrid learning models for adaptively adjusting locality

The hybrid models can achieve a better compromise between capacity and locality, and they outperform both global learning and local learning in a typical learning problem: spam filtering.

Improving BAS Committee with ETL Voting

This work presents ETL Voting BAS Committee, a scheme that combines ETL and BAS Committee in order to determine the best combination for the classifiers of the ensemble.



Computational complexity of machine learning

  • M. Kearns
  • Computer Science
    ACM distinguished dissertations
  • 1990
A centerpiece of the thesis is a series of results demonstrating the computational difficulty of learning a number of well-studied concept classes by reducing some apparently hard number-theoretic problems from cryptography to the learning problems.

On the necessity of Occam algorithms

It is shown for many natural concept classes that the PAC-learnability of the class implies the existence of an Occam algorithm for the class, and an interpretation of these results is that for many classes, PAC-learnability is equivalent to data compression.

On learning a union of half spaces

  • E. Baum
  • Computer Science, Mathematics
    J. Complex.
  • 1990

A theory of the learnable

This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.

On the learnability of Boolean formulae

The goals are to prove results and develop general techniques that shed light on the boundary between the classes of expressions that are learnable in polynomial time and those that are apparently not, and to employ the distribution-free model of learning.

Learning decision trees from random examples

Learning Nested Differences of Intersection-Closed Concept Classes

A new framework for constructing learning algorithms is presented, in which master algorithms use learning algorithms for intersection-closed concept classes as subroutines, and these algorithms are shown to be optimal or nearly optimal with respect to several different criteria.

Computational limitations on learning from examples

It is shown for various classes of concept representations that these cannot be learned feasibly in a distribution-free sense unless R = NP, and relationships between learning of heuristics and finding approximate solutions to NP-hard optimization problems are given.