# Agnostic Proper Learning of Halfspaces under Gaussian Marginals

```bibtex
@inproceedings{Diakonikolas2021AgnosticPL,
  title     = {Agnostic Proper Learning of Halfspaces under Gaussian Marginals},
  author    = {Ilias Diakonikolas and Daniel M. Kane and Vasilis Kontonis and Christos Tzamos and Nikos Zarifis},
  booktitle = {Annual Conference Computational Learning Theory},
  year      = {2021}
}
```

We study the problem of agnostically learning halfspaces under the Gaussian distribution. Our main result is the first proper learning algorithm for this problem whose sample complexity and computational complexity qualitatively match those of the best known improper agnostic learner. Building on this result, we also obtain the first proper polynomial-time approximation scheme (PTAS) for agnostically learning homogeneous halfspaces. Our techniques naturally extend to agnostically learning…
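As a rough illustration of the problem setting only (not the paper's algorithm), the sketch below generates Gaussian examples labeled by an unknown homogeneous halfspace, flips a small fraction of labels, and recovers a proper hypothesis via the classical averaging (Chow-parameter) estimate, which is proportional to the true normal vector under Gaussian marginals. Names like `w_hat` and the 5% random-flip noise (a stand-in for truly adversarial noise) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 20_000

# Unknown homogeneous halfspace x -> sign(<w_true, x>).
w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)

X = rng.normal(size=(n, d))          # Gaussian marginals
y = np.sign(X @ w_true)
y[rng.random(n) < 0.05] *= -1        # 5% label noise (random flips, for illustration)

# Averaging / Chow-parameter estimate: E[y x] is proportional to w_true
# under the Gaussian distribution, so the empirical mean recovers the direction.
w_hat = (y[:, None] * X).mean(axis=0)
w_hat /= np.linalg.norm(w_hat)

# Disagreement with the clean target halfspace.
err = float(np.mean(np.sign(X @ w_hat) != np.sign(X @ w_true)))
```

The output `w_hat` is itself a halfspace, so this toy estimator is proper; the paper's contribution is achieving this properness with error guarantees matching the best improper (polynomial-regression-based) agnostic learners.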

## 4 Citations

### Testing distributional assumptions of learning algorithms

- Computer Science · ArXiv
- 2022

A model is proposed for systematically studying the design of tester-learner pairs (T, A): if the distribution over examples in the data passes the tester T, then one can safely trust the output of the agnostic learner A on that data.

### Understanding Simultaneous Train and Test Robustness

- Computer Science · ALT
- 2022

This work shows that the two seemingly different notions of robustness at train-time and test-time are closely related, and that this connection can be leveraged to develop algorithmic techniques applicable in both settings.

### Approximate Maximum Halfspace Discrepancy

- Mathematics, Computer Science · ISAAC
- 2021

A key technical result is an ε-approximate halfspace range counting data structure of size O(1/ε) with O(log(1/ε)) query time, which can be built in O(|X| + (1/ε) log^4(1/ε)) time.

### Learning general halfspaces with general Massart noise under the Gaussian distribution

- Computer Science, Mathematics · STOC
- 2022

The techniques rely on determining the existence (or non-existence) of low-degree polynomials whose expectations distinguish Massart halfspaces from random noise, and establish a qualitatively matching lower bound of d^(Ω(log(1/γ))) on the complexity of any Statistical Query (SQ) algorithm.

## References

Showing 1–10 of 37 references.

### The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

- Computer Science, Mathematics · COLT
- 2021

It is shown that the L1-polynomial regression algorithm is essentially the best possible among SQ algorithms, and therefore that the SQ complexity of agnostic learning is closely related to the polynomial degree required to approximate any function from the concept class in L1-norm.
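The polynomial-regression approach discussed above can be sketched in one dimension: fit a low-degree polynomial to noisy ±1 labels and threshold it at zero. A minimal toy version, using an ordinary least-squares fit (`np.polyfit`) as a stand-in for the L1 regression the theory calls for; the threshold 0.3, degree 7, and 10% random-flip noise are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, thr, deg = 5000, 0.3, 7

x = rng.normal(size=n)               # one-dimensional Gaussian marginal
y = np.sign(x - thr)                 # labels from an unknown threshold function
y[rng.random(n) < 0.1] *= -1         # 10% label noise (random flips, for illustration)

# Fit a degree-7 polynomial to the noisy labels; since the sample is Gaussian,
# unweighted least squares approximates a Gaussian-weighted L2 fit.
coeffs = np.polyfit(x, y, deg)

# Thresholding the fitted polynomial at zero yields the (improper) hypothesis.
pred = np.sign(np.polyval(coeffs, x))
err = float(np.mean(pred != np.sign(x - thr)))
```

The resulting hypothesis is a polynomial threshold function rather than a halfspace, which is exactly why this classical learner is improper; the surveyed paper shows how to round such hypotheses back to a proper halfspace without losing the error guarantee.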

### Agnostically learning halfspaces

- Computer Science · 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS'05)
- 2005

We give the first algorithm that (under distributional assumptions) efficiently learns halfspaces in the notoriously difficult agnostic framework of Kearns, Schapire, & Sellie, where a learner is…

### Non-Convex SGD Learns Halfspaces with Adversarial Label Noise

- Computer Science · NeurIPS
- 2020

For a broad family of structured distributions, including log-concave distributions, it is shown that non-convex SGD efficiently converges to a solution with misclassification error O(opt) + ε, where opt is the misclassification error of the best-fitting halfspace.
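A minimal sketch of the idea described above: run projected SGD on a non-convex sigmoidal surrogate of the margin loss and average the late iterates. This is an illustration under assumptions (a sigmoid surrogate, random label flips as a stand-in for adversarial noise, hand-picked learning rate and step count), not the cited paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 5, 20_000

w_true = rng.normal(size=d)
w_true /= np.linalg.norm(w_true)
X = rng.normal(size=(n, d))          # Gaussian examples (log-concave marginals)
y = np.sign(X @ w_true)
y[rng.random(n) < 0.05] *= -1        # 5% label noise (random flips, for illustration)

# Non-convex surrogate of the 0-1 loss on the margin t = y<w, x>:
# loss(t) = 1 / (1 + e^t), decreasing in the margin.
w = rng.normal(size=d)
w /= np.linalg.norm(w)
lr, steps = 0.1, 100_000
w_avg = np.zeros(d)
for step, i in enumerate(rng.integers(0, n, size=steps)):
    t = y[i] * (X[i] @ w)
    s = 1.0 / (1.0 + np.exp(t))
    w += lr * s * (1.0 - s) * y[i] * X[i]   # negative gradient of the surrogate
    w /= np.linalg.norm(w)                  # project back onto the unit sphere
    if step >= steps // 2:                  # tail-average to damp SGD noise
        w_avg += w
w_hat = w_avg / np.linalg.norm(w_avg)

err = float(np.mean(np.sign(X @ w_hat) != np.sign(X @ w_true)))
```

Tail averaging is a standard variance-reduction choice here; without it, the constant-step-size iterates keep fluctuating around the optimum.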

### Toward efficient agnostic learning

- Computer Science · Machine Learning
- 2004

An investigation of generalizations of the Probably Approximately Correct (PAC) learning model that attempt to significantly weaken the target function assumptions is initiated, providing an initial outline of the possibilities for agnostic learning.

### Learning geometric concepts with nasty noise

- Computer Science · STOC
- 2018

The first polynomial-time PAC learning algorithms for low-degree PTFs and intersections of halfspaces with dimension-independent error guarantees in the presence of nasty noise under the Gaussian distribution are given.

### New Results for Learning Noisy Parities and Halfspaces

- Computer Science · 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
- 2006

The first nontrivial algorithm for learning parities with adversarial noise is given, which shows that learning DNF expressions reduces to learning noisy parities on just a logarithmic number of variables, and that majorities of halfspaces are hard to PAC-learn using any representation.

### Complexity theoretic limitations on learning halfspaces

- Computer Science, Mathematics · STOC
- 2016

It is shown that no efficient learning algorithm has non-trivial worst-case performance, even under the guarantees that Err_H(D) ≤ η for an arbitrarily small constant η > 0 and that D is supported on the Boolean cube.

### Hardness of Learning Halfspaces with Noise

- Computer Science, Mathematics · 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
- 2006

It is proved that even a tiny amount of worst-case noise makes the problem of learning halfspaces intractable in a strong sense, and strong hardness is obtained for another basic computational problem: solving a linear system over the rationals.

### Approximation Schemes for ReLU Regression

- Computer Science, Mathematics · COLT
- 2020

The main insight is a new characterization of surrogate losses for nonconvex activations, showing that properties of the underlying distribution actually induce strong convexity for the loss, allowing us to relate the global minimum to the activation's Chow parameters.

### Learning Halfspaces with Malicious Noise

- Computer Science · ICALP
- 2009

New algorithms are given for learning halfspaces in the challenging malicious noise model; they tolerate malicious noise rates exponentially larger than in previous work in terms of the dependence on the dimension n, and succeed for the fairly broad class of all isotropic log-concave distributions.