Efficient Learning of Linear Separators under Bounded Noise
@inproceedings{Awasthi2015EfficientLO,
  title={Efficient Learning of Linear Separators under Bounded Noise},
  author={Pranjal Awasthi and Maria-Florina Balcan and Nika Haghtalab and Ruth Urner},
  booktitle={COLT},
  year={2015}
}
We study the learnability of linear separators in $\Re^d$ in the presence of bounded (a.k.a. Massart) noise. This is a realistic generalization of the random classification noise model, where the adversary can flip the label of each example $x$ with probability $\eta(x) \leq \eta$. We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta…
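For concreteness, here is a minimal sketch (my own illustration, not code from the paper) of how labeled data under this bounded-noise model can be simulated: points are drawn uniformly from the unit ball, labeled by a target halfspace, and each label is flipped independently with some probability $\eta(x) \leq \eta$. The sampling routine, the per-point choice of $\eta(x)$, and the target vector `w_star` are all illustrative assumptions.

```python
import numpy as np

def sample_unit_ball(n, d, rng):
    """Draw n points uniformly at random from the unit ball in R^d."""
    x = rng.standard_normal((n, d))
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # uniform direction on the sphere
    r = rng.random(n) ** (1.0 / d)                   # radius via inverse-CDF sampling
    return x * r[:, None]

def massart_labels(X, w_star, eta, rng):
    """Label X by sign(<w_star, x>), then flip each label with probability
    eta(x) <= eta. Here eta(x) is drawn at random in [0, eta] per point,
    purely for illustration; the model allows any choice with eta(x) <= eta."""
    y = np.sign(X @ w_star)
    eta_x = rng.random(len(X)) * eta                 # some eta(x) <= eta for each point
    flips = rng.random(len(X)) < eta_x
    return np.where(flips, -y, y)

# Example usage with hypothetical parameters.
rng = np.random.default_rng(0)
d, n, eta = 10, 1000, 0.1
w_star = np.eye(d)[0]                                # illustrative target separator
X = sample_unit_ball(n, d, rng)
y = massart_labels(X, w_star, eta, rng)
```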
68 Citations
Hardness of Learning Halfspaces with Massart Noise
- Computer Science, ArXiv
- 2020
There is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm, and this lower bound implies that no efficient SQ algorithm can approximate the optimal error to within any polynomial factor.
The Power of Localization for Efficiently Learning Linear Separators with Noise
- Computer Science, J. ACM
- 2017
This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).
Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance
- Computer Science, ALT
- 2021
The main techniques include attribute-efficient paradigms for instance reweighting and for empirical risk minimization, and a new analysis of uniform concentration for unbounded data -- all of them crucially take the structure of the underlying halfspace into account.
Learning Halfspaces with Tsybakov Noise
- Computer Science, ArXiv
- 2020
This work uses a novel, computationally efficient procedure, based on semi-definite programming, to certify whether a candidate solution is near-optimal, and turns this certification procedure, used as a black box, into an efficient learning algorithm by searching over the space of halfspaces via online convex optimization.
Robust Learning under Strong Noise via SQs
- Computer Science, Mathematics, AISTATS
- 2021
This work provides several new insights on the robustness of Kearns' statistical query framework against challenging label-noise models, and shows that every SQ learnable class admits an efficient learning algorithm with OPT + $\epsilon$ misclassification error for a broad class of noise models.
Efficient active learning of sparse halfspaces with arbitrary bounded noise
- Computer Science, NeurIPS
- 2020
This work substantially improves on the existing polynomial time algorithm for active learning of homogeneous $s$-sparse halfspaces under bounded noise and isotropic log-concave distributions, with a label complexity of $\tilde{\mathcal{O}}\Big(\frac{s}{(1-2\eta)^4} \mathrm{polylog} (d, \frac 1 \epsilon) \Big)$.
A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise
- Computer Science, ArXiv
- 2020
The first polynomial-time certificate algorithm for PAC learning homogeneous halfspaces in the presence of Tsybakov noise is given, which learns the true halfspace within any desired accuracy $\epsilon$ and succeeds under a broad family of well-behaved distributions including log-concave distributions.
Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise
- Computer Science
- 2020
It is shown that no efficient SQ algorithm for learning Massart halfspaces on $\Re^d$ can achieve error better than $\Omega(\eta)$, even if $\mathrm{OPT} = 2^{-\log^c(d)}$, for any universal constant $c \in (0, 1)$.
Efficient active learning of sparse halfspaces
- Computer Science, COLT
- 2018
This paper provides a computationally efficient algorithm that achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac{1}{\epsilon}))$, where $t$ is the sparsity of the target halfspace, under certain distributional assumptions on the data.
Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise
- Computer Science, ArXiv
- 2022
Tight statistical query lower bounds for learning halfspaces in the presence of Massart noise are given, and it is shown that for arbitrary $\eta \in [0, 1/2]$ every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries.
References
The Power of Localization for Efficiently Learning Linear Separators with Noise
- Computer Science, J. ACM
- 2017
This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).
A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions
- Computer Science, Mathematics, Algorithmica
- 1998
It is shown how simple greedy methods can be used to find weak hypotheses (hypotheses that correctly classify noticeably more than half of the examples) in polynomial time, without dependence on any separation parameter.
Efficient Learning of Linear Perceptrons
- Computer Science, NIPS
- 2000
It is proved that unless P=NP, there is no algorithm that runs in time polynomial in the sample size and in $1/\mu$ and is $\mu$-margin successful for all $\mu > 0$.
Hardness of Learning Halfspaces with Noise
- Computer Science, Mathematics, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
- 2006
It is proved that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense, and a strong hardness result is obtained for another basic computational problem: solving a linear system over the rationals.
Embedding Hard Learning Problems into Gaussian Space
- Computer Science, Electron. Colloquium Comput. Complex.
- 2014
The first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution is given, showing the inherent difficulty of designing supervised learning algorithms in Euclidean space even in the presence of strong distributional assumptions.
Agnostically learning halfspaces
- Computer Science, 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05)
- 2005
We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is…
Learning Kernel-Based Halfspaces with the 0-1 Loss
- Computer Science, SIAM J. Comput.
- 2011
A new algorithm for agnostically learning kernel-based halfspaces with respect to the 0-1 loss function is described and analyzed, and a hardness result is proved, showing that under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in $L$.
Efficient algorithms in computational learning theory
- Computer Science
- 2001
This thesis gives the first proof that attribute-efficient learning (a type of learning from very few examples) can be computationally hard, and gives an optimal characterization of Disjunctive Normal Form formulae as thresholded real-valued polynomials.
Learning Halfspaces with Malicious Noise
- Computer Science, ICALP
- 2009
New algorithms for learning halfspaces in the challenging malicious noise model can tolerate malicious noise rates exponentially larger than previous work in terms of the dependence on the dimension n, and succeed for the fairly broad class of all isotropic log-concave distributions.
Statistical Active Learning Algorithms
- Computer Science, NIPS
- 2013
It is shown that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm which is tolerant to random classification noise as well as other forms of "uncorrelated" noise, leading to the first differentially-private active learning algorithms with exponential label savings over the passive case.