Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun
We establish optimal Statistical Query (SQ) lower bounds for robustly learning certain families of discrete high-dimensional distributions. In particular, we show that no efficient SQ algorithm with access to an ε-corrupted binary product distribution can learn its mean within ℓ2-error o(ε√(log(1/ε))). Similarly, we show that no efficient SQ algorithm with access to an ε-corrupted ferromagnetic high-temperature Ising model can learn the model to…
1 Citation

SQ Lower Bounds for Learning Single Neurons with Massart Noise

A novel SQ-hard construction, interesting in its own right, is given for learning {±1}-weight Massart halfspaces on the Boolean hypercube.

Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise

It is shown that no efficient SQ algorithm for learning Massart halfspaces on ℝ^d can achieve error better than Ω(η), even if OPT = 2^(−log^c(d)), for any universal constant c ∈ (0, 1).

Robust Estimators in High Dimensions without the Computational Intractability

This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.

The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

It is shown that the L1-polynomial regression algorithm is essentially best possible among SQ algorithms, and therefore that the SQ complexity of agnostic learning is closely related to the polynomial degree required to approximate any function from the concept class in the L1-norm.

Statistical Query Lower Bounds for List-Decodable Linear Regression

The main result is a Statistical Query (SQ) lower bound that qualitatively matches the performance of previously developed algorithms, providing evidence that current upper bounds for this task are nearly best possible.

Statistical Algorithms and a Lower Bound for Detecting Planted Cliques

The main application is a nearly optimal lower bound on the complexity of any statistical query algorithm for detecting planted bipartite clique distributions when the planted clique has size O(n^(1/2−δ)) for any constant δ > 0.

Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks

The first polynomial-time algorithm is given for PAC learning one-hidden-layer ReLU networks with arbitrary real coefficients, and a Statistical Query lower bound of d^(Ω(k)) is proved.

Statistical Query Algorithms for Mean Vector Estimation and Stochastic Convex Optimization

It is shown that well-known and popular first-order iterative methods can be implemented using only statistical queries and derive nearly matching upper and lower bounds on the estimation (sample) complexity, including linear optimization in the most general setting.
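As an illustration of that reduction (a minimal sketch, not taken from the paper), the snippet below runs gradient descent on the population squared loss using only statistical queries: the gradient at w is w − E[x], and each coordinate expectation E[x_j] is obtained from a query oracle that may err by up to an additive tolerance. The function names and parameters here are hypothetical, and the oracle is simulated from a finite sample.

```python
import random

def sq_gradient_descent(sample, dim, steps, lr, tolerance):
    """Minimize f(w) = E[||w - x||^2] / 2 using only statistical queries.

    The exact gradient is w - E[x]; each coordinate expectation E[x_j] is
    answered by an oracle that is allowed an additive error up to
    `tolerance` (simulated here by a finite sample plus uniform noise).
    """
    def sq(f):
        # Empirical expectation of f over the sample, plus bounded noise,
        # standing in for an adversarial SQ oracle answer.
        exact = sum(f(x) for x in sample) / len(sample)
        return exact + random.uniform(-tolerance, tolerance)

    w = [0.0] * dim
    for _ in range(steps):
        grad = [w[j] - sq(lambda x, j=j: x[j]) for j in range(dim)]
        w = [w[j] - lr * grad[j] for j in range(dim)]
    return w

random.seed(1)
# Toy data: 5,000 draws from a product of Gaussians with mean 0.5.
sample = [[random.gauss(0.5, 0.1) for _ in range(2)] for _ in range(5_000)]
w = sq_gradient_descent(sample, dim=2, steps=50, lr=0.5, tolerance=0.01)
```

Because each step contracts the error toward E[x] while the oracle injects at most `tolerance` noise, the iterate settles within O(tolerance) of the true mean, which is the sense in which first-order methods are SQ-implementable.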

Efficient noise-tolerant learning from statistical queries

This paper formalizes a new but related model of learning from statistical queries and demonstrates the generality of the statistical query model, showing that practically every class learnable in Valiant’s model and its variants can also be learned in the new model (and thus can be learned in the presence of noise).
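To make the model concrete (a minimal sketch under assumed names, not code from the paper): an SQ learner never sees individual examples; it submits a bounded query function and receives its expectation up to an additive tolerance. The oracle below simulates this with a finite sample plus bounded noise.

```python
import random

def sq_oracle(sample, query, tolerance):
    """Answer one statistical query.

    Returns E[query(x)] over the sample, perturbed by a noise term of
    magnitude at most `tolerance` -- any such answer is legal in the SQ
    model. `query` must map an example to a value in [0, 1].
    """
    exact = sum(query(x) for x in sample) / len(sample)
    noisy = exact + random.uniform(-tolerance, tolerance)
    # Clamp so the answer remains a valid expectation in [0, 1].
    return min(1.0, max(0.0, noisy))

random.seed(0)
# Toy data: 10,000 draws of two independent biased coins (biases 0.7, 0.3).
sample = [[random.random() < 0.7, random.random() < 0.3]
          for _ in range(10_000)]
# Query the bias of the first coordinate to within tolerance 0.01.
est = sq_oracle(sample, lambda x: 1.0 if x[0] else 0.0, tolerance=0.01)
```

Noise tolerance comes for free in this model: since the learner only ever relies on approximate expectations, bounded classification noise perturbs each answer by an amount the tolerance already absorbs.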

On the Complexity of Random Satisfiability Problems with Planted Solutions

An unconditional lower bound of Ω(n^(r/2)) clauses for statistical algorithms is shown, tight up to logarithmic factors and matching the known upper bound (which, as is shown, can be implemented using a statistical algorithm).

Noise-tolerant learning, the parity problem, and the statistical query model

The algorithm runs in polynomial time for the case of parity functions that depend on only the first O(log n log log n) bits of input, which provides the first known instance of an efficient noise-tolerant algorithm for a concept class that is not learnable in the Statistical Query model of Kearns [1998].