Corpus ID: 243847809

Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise

@inproceedings{Diakonikolas2020NearOptimalSQ,
  title={Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise},
  author={Ilias Diakonikolas and Daniel M. Kane},
  year={2020}
}
We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples (x, y) from a distribution D on R^d × {±1} such that the marginal D_x on the examples is arbitrary and the label y of example x is generated from the target halfspace corrupted by a Massart adversary with flipping probability η(x) ≤ η ≤ 1/2, the goal is to compute a hypothesis with small misclassification error. The best known poly(d, 1/ε)-time algorithms for this problem achieve error of η + ε, which can be…
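To make the noise model concrete, the following is a minimal sketch (not from the paper) that generates Massart-corrupted labels for a hypothetical target halfspace w*. The dimension, sample size, noise bound, and Gaussian marginal below are all illustrative assumptions; the model itself allows the marginal D_x to be arbitrary and the adversary to choose any η(x) ≤ η.

import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 5, 1000, 0.3             # assumed dimension, sample size, and noise bound

w_star = rng.normal(size=d)          # hypothetical target halfspace w*
X = rng.normal(size=(n, d))          # Gaussian marginal chosen for the demo; the model allows any D_x
clean = np.sign(X @ w_star)          # noiseless labels sign(<w*, x>)

# The Massart adversary may pick any flipping probability eta(x) in [0, eta];
# here it is drawn at random per example.
eta_x = rng.uniform(0.0, eta, size=n)
flip = rng.random(n) < eta_x
y = np.where(flip, -clean, clean)    # Massart-corrupted labels in {±1}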
3 Citations
Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise
TLDR
Tight statistical query lower bounds for learning halfspaces in the presence of Massart noise are given, and it is shown that for arbitrary η ∈ [0, 1/2], every SQ algorithm achieving misclassification error better than η requires either queries of superpolynomial accuracy or at least a superpolynomial number of queries.
Efficient PAC Learning from the Crowd with Pairwise Comparison
TLDR
A label-efficient algorithm is given that interleaves learning and annotation, leading to a constant overhead (a notion that characterizes the query complexity); in contrast, the natural approach of annotation followed by learning leads to an overhead that grows with the sample size.
Non-Gaussian Component Analysis via Lattice Basis Reduction
TLDR
A sample- and computationally efficient algorithm for NGCA is obtained in the regime where A is discrete or nearly discrete, in a well-defined technical sense.

References

Showing 1–10 of 71 references
Hardness of Learning Halfspaces with Massart Noise
TLDR
There is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm, and this lower bound implies that no efficient SQ algorithm can approximate the optimal error within any polynomial factor.
Complexity theoretic limitations on learning halfspaces
TLDR
It is shown that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that Err_H(D) ≤ η for an arbitrarily small constant η > 0 and that D is supported on the Boolean cube.
Hardness of Learning Halfspaces with Noise
  • V. Guruswami, P. Raghavendra
  • Computer Science, Mathematics
    2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
  • 2006
TLDR
It is proved that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense, and a strong hardness is obtained for another basic computational problem: solving a linear system over the rationals.
Boosting in the Presence of Massart Noise
TLDR
This work presents the first computationally efficient boosting algorithm in the presence of Massart noise that achieves misclassification error arbitrarily close to η, and gives the first efficient Massart learner for unions of high-dimensional rectangles.
Learning geometric concepts with nasty noise
TLDR
The first polynomial-time PAC learning algorithms are given for low-degree PTFs and intersections of halfspaces, with dimension-independent error guarantees, in the presence of nasty noise under the Gaussian distribution.
A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise
TLDR
The first polynomial-time certificate algorithm for PAC learning homogeneous halfspaces in the presence of Tsybakov noise is given; it learns the true halfspace within any desired accuracy ε and succeeds under a broad family of well-behaved distributions including log-concave distributions.
Robust Estimators in High Dimensions without the Computational Intractability
TLDR
This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.
Efficient Learning of Linear Separators under Bounded Noise
TLDR
This work provides the first evidence that one can indeed design algorithms achieving arbitrarily small excess error in polynomial time under this realistic noise model and thus opens up a new and exciting line of research.
Embedding Hard Learning Problems into Gaussian Space
TLDR
The first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution is given, showing the inherent difficulty of designing supervised learning algorithms in Euclidean space even in the presence of strong distributional assumptions.
Efficient noise-tolerant learning from statistical queries
TLDR
This paper formalizes a new but related model of learning from statistical queries, and demonstrates the generality of the statistical query model, showing that practically every class learnable in Valiant’s model and its variants can also be learned in the new model (and thus can be learned in the presence of noise).
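For context, here is a minimal sketch (an assumed illustration, not Kearns' formulation verbatim) of the statistical query model underlying the hardness results above: rather than receiving individual samples, the learner submits a bounded query φ together with a tolerance τ, and the oracle may return any value within τ of the expectation of φ(x, y).

import numpy as np

rng = np.random.default_rng(1)

def sq_oracle(phi, X, y, tau):
    """Answer the statistical query phi: (x, y) -> [-1, 1] within additive tolerance tau."""
    # The empirical mean over the sample stands in for the expectation E_D[phi(x, y)].
    true_mean = np.mean([phi(xi, yi) for xi, yi in zip(X, y)])
    # Any answer within tau of the true expectation is a valid oracle response.
    return true_mean + rng.uniform(-tau, tau)

# Usage with hypothetical data: estimate how often labels agree with a fixed direction w.
X = rng.normal(size=(100, 3))
y = np.sign(X[:, 0])
w = np.array([1.0, 0.0, 0.0])
agreement = sq_oracle(lambda xi, yi: float(np.sign(xi @ w) == yi), X, y, tau=0.01)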