Does Unlabeled Data Provably Help? Worst-case Analysis of the Sample Complexity of Semi-Supervised Learning

@inproceedings{BenDavid2008DoesUD,
  title={Does Unlabeled Data Provably Help? Worst-case Analysis of the Sample Complexity of Semi-Supervised Learning},
  author={Shai Ben-David and Tyler Lu and D{\'a}vid P{\'a}l},
  booktitle={COLT},
  year={2008}
}
We study the potential benefits to classification prediction that arise from having access to unlabeled samples. We compare learning in the semi-supervised model to the standard, supervised PAC (distribution free) model, considering both the realizable and the unrealizable (agnostic) settings. Roughly speaking, our conclusion is that access to unlabeled samples cannot provide sample size guarantees that are better than those obtainable without access to unlabeled data, unless one postulates…
This paper has 93 citations.
