A Geometric Approach to Sample Compression
@article{Rubinstein2009AGA, title={A Geometric Approach to Sample Compression}, author={Benjamin I. P. Rubinstein and J. Hyam Rubinstein}, journal={J. Mach. Learn. Res.}, year={2009}, volume={13}, pages={1221-1261} }
The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer's Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase to VC dimension, and compression via…
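To make the abstract's definition of a maximum class concrete, here is a minimal sketch, assuming a finite, non-empty concept class given as a set of binary tuples over n points. The helper names (`vc_dimension`, `sauer_bound`, `is_maximum`) and the example classes are hypothetical illustrations, not code from the paper. The check is brute force and only practical for small n.

```python
from itertools import combinations
from math import comb


def vc_dimension(concepts, n):
    """Brute-force VC dimension of a non-empty set of binary tuples of length n."""
    d = 0
    for k in range(1, n + 1):
        # A set of k coordinates is shattered if all 2**k labelings appear on it.
        shattered = any(
            len({tuple(c[i] for i in idx) for c in concepts}) == 2 ** k
            for idx in combinations(range(n), k)
        )
        if not shattered:
            break  # subsets of shattered sets are shattered, so no larger k can succeed
        d = k
    return d


def sauer_bound(n, d):
    """Sauer's Lemma bound: sum of C(n, i) for i = 0..d."""
    return sum(comb(n, i) for i in range(d + 1))


def is_maximum(concepts, n):
    """True if the class meets Sauer's Lemma with equality."""
    return len(concepts) == sauer_bound(n, vc_dimension(concepts, n))


# Hypothetical example: all labelings of 3 points with at most one positive label.
at_most_one = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)}
# Dropping one concept keeps the VC dimension but falls below the Sauer bound.
smaller = {(0, 0, 0), (1, 0, 0), (0, 1, 0)}

print(vc_dimension(at_most_one, 3), is_maximum(at_most_one, 3))
print(vc_dimension(smaller, 3), is_maximum(smaller, 3))
```

The second class is not maximum, and it is not maximal either, since adding (0, 0, 1) back does not raise its VC dimension.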
36 Citations
Bounding Embeddings of VC Classes into Maximum Classes
- Mathematics, Computer Science
- ArXiv
- 2014
It is shown that maximum classes can be characterised by a local-connectivity property of the graph obtained by viewing the class as a cubical complex, and a negative embedding result is proved which demonstrates VC-d classes that cannot be embedded in any maximum class of VC dimension lower than 2d.
Labeled Compression Schemes for Extremal Classes
- Mathematics, Computer Science
- ALT
- 2016
The key result of the paper is a construction of a sample compression scheme for extremal classes of size equal to their VC dimension, based on a powerful generalization of the Sauer-Shelah bound called the Sandwich Theorem.
Unlabeled sample compression schemes and corner peelings for ample and maximum classes
- Mathematics
- ICALP
- 2019
Unlabelled Sample Compression Schemes for Intersection-Closed Classes and Extremal Classes
- Mathematics, Computer Science
- ArXiv
- 2022
This paper proves that all intersection-closed classes with VC dimension d admit unlabelled compression schemes of size at most 11d, and simplifies and extends their proof technique to deal with so-called extremal classes of VC dimension d which contain maximum classes of VC dimension d − 1.
Labeled sample compression schemes for complexes of oriented matroids
- Mathematics
- SSRN Electronic Journal
- 2022
It is shown that the topes of a complex of oriented matroids (abbreviated COM) of VC-dimension d admit a proper labeled sample compression scheme of size d, which is a step towards the sample compression conjecture.
Sample compression schemes for VC classes
- Computer Science
- 2016 Information Theory and Applications Workshop (ITA)
- 2016
It is shown that every concept class C with VC dimension d has a sample compression scheme of size exponential in d, and an approximate minimax phenomenon for binary matrices of low VC dimension is used, which may be of interest in the context of game theory.
Honest Compressions and Their Application to Compression Schemes
- Mathematics, Computer Science
- COLT
- 2013
This work proves the existence of such compression schemes under stronger assumptions than finite VC dimension, in concept classes defined by hyperplanes, polynomials, exponentials, restricted analytic functions, and compositions, additions and multiplications of all of the above.
Unlabeled sample compression schemes for oriented matroids
- Mathematics
- 2022
A long-standing sample compression conjecture asks to linearly bound the size of the optimal sample compression schemes by the Vapnik-Chervonenkis (VC) dimension of an arbitrary class. In this paper,…
Compressing and Teaching for Low VC-Dimension
- Computer Science
- 2015 IEEE 56th Annual Symposium on Foundations of Computer Science
- 2015
This work shows that, given an arbitrary set of labeled examples from an unknown concept in C, one can retain only a subset of exp(d) of them in a way that allows recovering the labels of all other examples in the set, using an additional exp(d) information bits.
References
Geometric & Topological Representations of Maximum Classes with Applications to Sample Compression
- Mathematics
- COLT
- 2008
Finite maximum classes are systematically investigated, showing that d-maximum classes corresponding to PL hyperplane arrangements in R^d have cubical complexes homeomorphic to a d-ball, or equivalently complexes that are manifolds with boundary.
Space-bounded learning and the Vapnik-Chervonenkis dimension
- Mathematics
- COLT '89
- 1989
Vapnik-Chervonenkis dimension and (pseudo-)hyperplane arrangements
- Mathematics
- Discret. Comput. Geom.
- 1994
The correspondence to arrangements is obtained indirectly via a new characterization of uniform oriented matroids: a range space (X, ℛ) naturally corresponds to a uniform oriented matroid of rank |X| − d if and only if its VC-dimension is d, R ∈ ℛ implies X − R ∈ ℛ, and |ℛ| is maximum under these conditions.
Relating Data Compression and Learnability
- Computer Science
- 2003
It is demonstrated that the existence of a suitable data compression scheme is sufficient to ensure learnability and the introduced compression scheme provides a rigorous model for studying data compression in connection with machine learning.
Shifting: One-inclusion mistake bounds and sample compression
- Computer Science, Mathematics
- J. Comput. Syst. Sci.
- 2009
Shifting, One-Inclusion Mistake Bounds and Tight Multiclass Expected Risk Bounds
- Computer Science
- NIPS
- 2006
A density bound of $n \binom{n-1}{\le d-1} / \binom{n}{\le d} < d$, which positively resolves a conjecture of Kuzmin & Warmuth relating to their unlabeled Peeling compression scheme and also leads to an improved mistake bound for the randomized (deterministic) one-inclusion strategy for all d.
Learnability and the Vapnik-Chervonenkis dimension
- Computer Science
- JACM
- 1989
This paper shows that the essential condition for distribution-free learnability is finiteness of the Vapnik-Chervonenkis dimension, a simple combinatorial parameter of the class of concepts to be learned.
Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension
- Mathematics
- J. Comb. Theory, Ser. A
- 1995
Combinatorial Variability of Vapnik-Chervonenkis Classes with Applications to Sample Compression Schemes
- Computer Science, Mathematics
- Discret. Appl. Math.
- 1998
A Compression Approach to Support Vector Model Selection
- Computer Science
- J. Mach. Learn. Res.
- 2004
Inspired by several generalization bounds, "compression coefficients" for SVMs are constructed which measure the amount by which the training labels can be compressed by a code built from the separating hyperplane and can fairly accurately predict the parameters for which the test error is minimized.