• Corpus ID: 6077315

Efficient Learning of Linear Separators under Bounded Noise

@inproceedings{Awasthi2015EfficientLO,
  title={Efficient Learning of Linear Separators under Bounded Noise},
  author={Pranjal Awasthi and Maria-Florina Balcan and Nika Haghtalab and Ruth Urner},
  booktitle={COLT},
  year={2015}
}
We study the learnability of linear separators in $\Re^d$ in the presence of bounded (a.k.a. Massart) noise. This is a realistic generalization of the random classification noise model, where the adversary can flip each example $x$ with probability $\eta(x) \leq \eta$. We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta… 
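To make the noise model concrete, the sketch below simulates the data-generation process described in the abstract: points drawn uniformly from the unit ball in $\Re^d$, labeled by a target halfspace, with each label flipped independently with probability $\eta(x) \leq \eta$. This is only an illustration; the function names, and the simplistic choice of a constant flip probability equal to the bound $\eta$, are assumptions of the sketch rather than anything specified in the paper.

```python
import numpy as np

def sample_unit_ball(n, d, rng):
    """Draw n points uniformly at random from the unit ball in R^d."""
    g = rng.standard_normal((n, d))
    g /= np.linalg.norm(g, axis=1, keepdims=True)   # uniform direction on the sphere
    r = rng.random(n) ** (1.0 / d)                   # radius with CDF r^d => uniform in the ball
    return g * r[:, None]

def massart_labels(X, w_star, eta, rng):
    """Label by sign(<w_star, x>), then flip each label with probability eta(x) <= eta.

    For simplicity eta(x) is taken to be the constant eta here; the Massart model
    allows any per-example flip probability that stays below the bound eta.
    """
    clean = np.sign(X @ w_star)
    clean[clean == 0] = 1.0
    flip = rng.random(len(X)) < eta                  # independent Bernoulli(eta) flips
    return np.where(flip, -clean, clean)

rng = np.random.default_rng(0)
d, n, eta = 10, 2000, 0.3                            # eta < 1/2 is the noise bound
w_star = np.eye(d)[0]                                # target halfspace through the origin
X = sample_unit_ball(n, d, rng)
y = massart_labels(X, w_star, eta, rng)
print("empirical flip rate:", float(np.mean(y != np.sign(X @ w_star))))
```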
Hardness of Learning Halfspaces with Massart Noise
TLDR
There is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm, and this lower bound implies that no efficient SQ algorithm can approximate the optimal error within any polynomial factor.
The Power of Localization for Efficiently Learning Linear Separators with Noise
TLDR
This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).
Attribute-Efficient Learning of Halfspaces with Malicious Noise: Near-Optimal Label Complexity and Noise Tolerance
TLDR
The main techniques include attribute-efficient paradigms for instance reweighting and for empirical risk minimization, and a new analysis of uniform concentration for unbounded data -- all of them crucially take the structure of the underlying halfspace into account.
Learning Halfspaces with Tsybakov Noise
TLDR
This work develops a novel, computationally efficient procedure, based on semi-definite programming, for certifying whether a candidate solution is near-optimal, and uses it as a black box to obtain an efficient learning algorithm that searches over the space of halfspaces via online convex optimization.
Robust Learning under Strong Noise via SQs
TLDR
This work provides several new insights on the robustness of Kearns' statistical query framework against challenging label-noise models, and shows that every SQ learnable class admits an efficient learning algorithm with OPT + $\epsilon$ misclassification error for a broad class of noise models.
Efficient active learning of sparse halfspaces with arbitrary bounded noise
TLDR
This work substantially improves on the existing polynomial time algorithm for active learning of homogeneous-sparse halfspaces under bounded noise and isotropic log-concave distributions, with a label complexity of $\tilde{\mathcal{O}}\Big(\frac{s}{(1-2\eta)^4} \mathrm{polylog} (d, \frac 1 \epsilon) \Big)$.
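For a rough sense of scale (a back-of-the-envelope calculation, not a claim from the paper): the noise-dependent factor $\frac{1}{(1-2\eta)^4}$ in this label complexity grows rapidly as $\eta$ approaches $1/2$; at $\eta = 0.1$ it is $\frac{1}{0.8^4} \approx 2.44$, while at $\eta = 0.4$ it is already $\frac{1}{0.2^4} = 625$.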
A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise
TLDR
The first polynomial-time certificate algorithm for PAC learning homogeneous halfspaces in the presence of Tsybakov noise is given, which learns the true halfspace within any desired accuracy $\epsilon$ and succeeds under a broad family of well-behaved distributions including log-concave distributions.
Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise
TLDR
It is shown that no efficient SQ algorithm for learning Massart halfspaces on $\Re^d$ can achieve error better than $\Omega(\eta)$, even if $\mathrm{OPT} = 2^{-\log^{c}(d)}$, for any universal constant $c \in (0, 1)$.
Efficient active learning of sparse halfspaces
TLDR
This paper provides a computationally efficient algorithm that achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 \epsilon))$ under certain distributional assumptions on the data.
Optimal SQ Lower Bounds for Learning Halfspaces with Massart Noise
TLDR
Tight statistical query lower bounds for learning halfspaces in the presence of Massart noise are given, and it is shown that for arbitrary $\eta \in [0, 1/2]$ every SQ algorithm achieving misclassification error better than $\eta$ requires queries of superpolynomial accuracy or at least a superpolynomial number of queries.
...
...

References

SHOWING 1-10 OF 44 REFERENCES
The Power of Localization for Efficiently Learning Linear Separators with Noise
TLDR
This work provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise, and achieves a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm).
A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions
TLDR
It is shown how simple greedy methods can be used to find weak hypotheses (hypotheses that correctly classify noticeably more than half of the examples) in polynomial time, without dependence on any separation parameter.
Efficient Learning of Linear Perceptrons
TLDR
It is proved that unless P=NP, there is no algorithm that runs in time polynomial in the sample size and in $1/\mu$ that is $\mu$-margin successful for all $\mu > 0$.
Hardness of Learning Halfspaces with Noise
  • V. Guruswami, P. Raghavendra
  • Computer Science, Mathematics
    2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
  • 2006
TLDR
It is proved that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense, and a strong hardness result is obtained for another basic computational problem: solving a linear system over the rationals.
Embedding Hard Learning Problems into Gaussian Space
TLDR
The first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution is given, showing the inherent difficulty of designing supervised learning algorithms in Euclidean space even in the presence of strong distributional assumptions.
Agnostically learning halfspaces
We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is…
Learning Kernel-Based Halfspaces with the 0-1 Loss
TLDR
A new algorithm for agnostically learning kernel-based halfspaces with respect to the 0-1 loss function is described and analyzed, and a hardness result is proved showing that, under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in $L$.
Efficient algorithms in computational learning theory
TLDR
This thesis gives the first proof that attribute-efficient learning (a type of learning from very few examples) can be computationally hard and gives an optimal characterization of Disjunctive Normal Form formulae as thresholded real-valued polynomials.
Learning Halfspaces with Malicious Noise
TLDR
New algorithms for learning halfspaces in the challenging malicious noise model can tolerate malicious noise rates exponentially larger than previous work in terms of the dependence on the dimension n, and succeed for the fairly broad class of all isotropic log-concave distributions.
Statistical Active Learning Algorithms
TLDR
It is shown that any efficient active statistical learning algorithm can be automatically converted to an efficient active learning algorithm which is tolerant to random classification noise as well as other forms of "uncorrelated" noise, leading to the first differentially-private active learning algorithms with exponential label savings over the passive case.
...
...