Corpus ID: 236087943

A Theory of PAC Learnability of Partial Concept Classes

Noga Alon, Steve Hanneke, Ron Holzman, Shay Moran
We extend the classical theory of PAC learning in a way that allows us to model a rich variety of practical learning tasks where the data satisfy special properties that ease the learning process: for example, tasks where the distance of the data from the decision boundary is bounded away from zero, or tasks where the data lie on a lower-dimensional surface. The basic and simple idea is to consider partial concepts: these are functions that can be undefined on certain parts of the space. When…
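To make the notion concrete, here is a minimal sketch (our own illustration, not code from the paper) of a partial concept: a γ-margin halfspace, which is defined only on points whose distance from the decision boundary is at least γ and undefined (`None`) elsewhere.

```python
import math

def margin_halfspace(w, gamma):
    """Partial concept induced by a homogeneous halfspace with margin gamma.

    The returned function h is defined (returns 0 or 1) only on points whose
    distance from the hyperplane {x : <w, x> = 0} is at least gamma; on the
    margin region it returns None, modelling the undefined part of the domain.
    (Illustrative names and signature are ours.)
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    w = [wi / norm for wi in w]  # unit normal, so <w, x> is the signed distance

    def h(x):
        m = sum(wi * xi for wi, xi in zip(w, x))
        if abs(m) < gamma:
            return None  # inside the margin: the partial concept is undefined
        return 1 if m > 0 else 0

    return h

h = margin_halfspace([1.0, 0.0], gamma=0.5)
print(h([2.0, 0.0]), h([-2.0, 0.0]), h([0.1, 0.0]))  # 1 0 None
```

A total concept would have to commit to a label inside the margin as well; the `None` region is exactly what lets the class encode the "data bounded away from the boundary" assumption.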


Realizable Learning is All You Need
This work gives the first model-independent framework explaining the equivalence of realizable and agnostic learnability: a three-line blackbox reduction that simplifies, unifies, and extends our understanding across a wide variety of settings.
Fat-shattering dimension of k-fold maxima
This work provides a considerably sharper upper bound and a matching lower bound on the fat-shattering dimension of the k-fold maximum of real-valued function classes, achieving, in particular, an optimal dependence on k.
Best lower bound on the probability of a binomial exceeding its expectation
Let $X$ be a random variable distributed according to the binomial distribution with parameters $n$ and $p$. It is shown that $P(X>EX)\ge 1/4$ if $1>p\ge c/n$, where $c:=\ln(4/3)$, the best possible…
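The bound is easy to check numerically. The sketch below (our own, standard library only) computes $P(X>EX)$ exactly from the binomial pmf at the boundary case $p=c/n$:

```python
from math import comb, log

def prob_exceeds_mean(n, p):
    """Exact P(X > E[X]) for X ~ Binomial(n, p), summing the pmf."""
    mean = n * p
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n + 1) if k > mean)

c = log(4 / 3)  # the constant from the bound
for n in (10, 50, 200):
    assert prob_exceeds_mean(n, c / n) >= 0.25  # the claimed 1/4 lower bound
```

At $p=c/n$ the mean is below 1, so the probability reduces to $1-(1-c/n)^n$, which exceeds $1-e^{-c}=1/4$ and approaches it as $n$ grows.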


Agnostic Online Learning
This work describes several models of non-realizable data, derives upper and lower bounds on the achievable regret, and extends Littlestone's theory to include margin-based hypothesis classes, in which the prediction of each hypothesis is accompanied by a confidence value.
Characterizations of Learnability for Classes of {0, ..., n}-Valued Functions
A general scheme for extending the VC-dimension to the case n > 1 is presented, defining a wide variety of notions of dimension in which all these variants of the VC-dimension, previously introduced in the context of learning, appear as special cases.
Relating Data Compression and Learnability
It is demonstrated that the existence of a suitable data compression scheme is sufficient to ensure learnability, and the introduced compression scheme provides a rigorous model for studying data compression in connection with machine learning.
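As a toy illustration of the idea (ours, not from the paper): 1-D threshold functions admit a sample compression scheme of size one, where a realizable sample is compressed to a single labelled point from which a consistent hypothesis is reconstructed.

```python
def compress(sample):
    """Compress a realizable sample of a 1-D threshold h_t(x) = [x >= t]
    to a single labelled point: the leftmost positive example, or the
    rightmost (negative) example when no positives exist."""
    positives = [(x, y) for x, y in sample if y == 1]
    if positives:
        return [min(positives)]  # leftmost positive point
    return [max(sample)]         # rightmost point; all labels are 0

def reconstruct(compressed):
    """Rebuild a threshold hypothesis from the compressed set."""
    x, y = compressed[0]
    if y == 1:
        return lambda z: 1 if z >= x else 0
    return lambda z: 1 if z > x else 0  # every point seen so far was negative

sample = [(0.2, 0), (1.5, 1), (0.9, 0), (2.3, 1)]
h = reconstruct(compress(sample))
assert all(h(x) == y for x, y in sample)  # consistent on the whole sample
```

The compression-learnability connection says that whenever such a small reconstruction set always exists, the class is learnable; here the scheme has size one because thresholds have VC-dimension one.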
Distance-Based Classification with Lipschitz Functions
The goal of this article is to find a generalization of linear decision functions for metric spaces and define a corresponding notion of margin such that the decision function separates the training points with a large margin.
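One concrete way to realize such a decision function (a sketch under our own choices of metric and constant, not the article's code) is the sign of the McShane extension f(x) = min_i (y_i + L·d(x, x_i)), the smallest L-Lipschitz function agreeing with the ±1 labels once L is at least 2 divided by the distance between the two classes.

```python
import math

def euclid(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def lipschitz_classifier(train, L, d=euclid):
    """Classify by the sign of the McShane extension
        f(x) = min_i ( y_i + L * d(x, x_i) ).
    train is a list of (point, label) pairs with labels in {-1, +1};
    L and the metric d are the knobs that play the role of a margin."""
    def f(x):
        return min(y + L * d(x, xi) for xi, y in train)
    return lambda x: 1 if f(x) >= 0 else -1

# Two 1-D training points at distance 1, so L = 2 suffices.
clf = lipschitz_classifier([((0.0,), -1), ((1.0,), 1)], L=2.0)
print(clf((0.1,)), clf((0.9,)))  # -1 1
```

Smaller L forces a smoother (larger-margin) decision function; as L grows the rule approaches nearest-neighbor classification.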
Algorithmic Luckiness
This paper studies learning algorithms more directly, in a way that allows one to exploit the serendipity of the training sample, and presents an application of this framework to the maximum-margin algorithm for linear classifiers, which yields a bound that exploits the margin.
Structural Risk Minimization Over Data-Dependent Hierarchies
A result is presented that allows one to trade off errors on the training sample against improved generalization performance, and a more general result in terms of "luckiness" functions, which provides a quite general way for exploiting serendipitous simplicity in observed data to obtain better prediction accuracy from small training sets.
Uniform convergence may be unable to explain generalization in deep learning
Through numerous experiments, doubt is cast on the power of uniform convergence-based generalization bounds to provide a complete picture of why overparameterized deep networks generalize well.
Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm
  • N. Littlestone
  • Mathematics
  • 28th Annual Symposium on Foundations of Computer Science (sfcs 1987)
  • 1987
Valiant (1984) and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss incremental learning of these functions. We consider a setting in…
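The algorithm in question is Winnow; here is a minimal single-pass sketch of its promotion/demotion variant (our own simplification of Littlestone's formulation, with threshold n/2 and update factor alpha).

```python
def winnow(examples, n, alpha=2.0):
    """One pass of a Winnow-style multiplicative-update learner.

    Each example is (x, y) with x a 0/1 tuple of length n and y in {0, 1}.
    Predict 1 when the weighted sum of active attributes reaches n/2; on a
    false negative, promote the active weights by alpha, and on a false
    positive, demote them by alpha."""
    w = [1.0] * n
    theta = n / 2
    mistakes = 0
    for x, y in examples:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
        if pred != y:
            mistakes += 1
            for i in range(n):
                if x[i]:
                    w[i] = w[i] * alpha if y == 1 else w[i] / alpha
    return w, mistakes

# Learn the disjunction "x0" among 4 attributes.
examples = [((1, 0, 0, 0), 1), ((0, 1, 1, 1), 0)] * 3
w, mistakes = winnow(examples, 4)
print(w, mistakes)  # [2.0, 0.5, 0.5, 0.5] 2
```

Because the updates are multiplicative, the mistake bound for k-literal monotone disjunctions scales with log n rather than n, which is what makes the algorithm robust when irrelevant attributes abound.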
Supervised learning through the lens of compression
This work uses the compressibility-learnability equivalence (i) to show that, for multiclass categorization, PAC and agnostic PAC learnability are equivalent, and (ii) to derive a compactness theorem for learnability.
Adversarial laws of large numbers and optimal regret in online classification
The sequential sampling model proposed by Ben-Eliezer and Yogev (2020) is considered, and the classes which admit a uniform law of large numbers in this model are characterized: these are exactly the classes that are online learnable.