Corpus ID: 60440600

Space lower bounds for linear prediction

Yuval Dagan, Gil Kur, Ohad Shamir
We show that fundamental learning tasks, such as finding an approximate linear separator or linear regression, require memory at least *quadratic* in the dimension, in a natural streaming setting. This implies that such problems cannot be solved (at least in this setting) by scalable memory-efficient streaming algorithms. Our results build on a memory lower bound for a simple linear-algebraic problem, finding orthogonal vectors, and utilize estimates on the packing of the…
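For context, the quadratic-memory regime the lower bound matches is exactly what the standard streaming baseline for linear regression uses: maintaining the d×d second-moment matrix of the examples. A minimal numpy sketch (the function name and regularizer are illustrative, not from the paper):

```python
import numpy as np

def streaming_ridge(stream, d, lam=1e-3):
    """Solve ridge regression over a stream of (x, y) examples by
    maintaining the d x d matrix X^T X and the d-vector X^T y.
    Memory is Theta(d^2), matching the paper's quadratic lower bound."""
    A = lam * np.eye(d)      # accumulates X^T X (plus regularizer)
    b = np.zeros(d)          # accumulates X^T y
    for x, y in stream:
        A += np.outer(x, x)  # the d^2 memory lives here
        b += y * x
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
d = 5
w_true = rng.standard_normal(d)
data = [(x, x @ w_true) for x in rng.standard_normal((200, d))]
w_hat = streaming_ridge(iter(data), d)
```

The lower bound says this d² footprint is essentially unavoidable in the streaming setting, not an artifact of this particular algorithm.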
Distributed Learning with Sublinear Communication
By slightly relaxing the standard boundedness assumptions for linear models, a family of algorithms is obtained that combines mirror descent with randomized sparsification/quantization of iterates; the result extends to the general stochastic convex optimization model.
In distributed statistical learning, N samples are split across m machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine.
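The communication savings above rest on the fact that iterates can be randomly quantized without bias, so that only a few bits per coordinate need to be transmitted. A minimal illustration of unbiased stochastic rounding (not the authors' exact scheme):

```python
import numpy as np

def randomized_quantize(v, rng):
    """Round each coordinate of v to floor(v) or ceil(v) at random,
    with probabilities chosen so that E[q] = v. The quantized vector
    is unbiased, so averaged iterates retain their guarantees while
    each coordinate can be sent with far fewer bits."""
    lo = np.floor(v)
    p = v - lo                              # probability of rounding up
    return lo + (rng.random(v.shape) < p)   # bool upcasts to 0.0/1.0

rng = np.random.default_rng(0)
v = np.array([0.3, 1.7, -0.25])
# Averaging many independent quantizations recovers v (unbiasedness).
avg = np.mean([randomized_quantize(v, rng) for _ in range(20000)], axis=0)
```

Real schemes combine this with sparsification (sending only a random subset of coordinates), but the unbiasedness argument is the same.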
Convex Set Disjointness, Distributed Learning of Halfspaces, and LP Feasibility
These protocols are based on a Container Lemma for Halfspaces and on two variants of Carathéodory's theorem, which may be of independent interest; upper and lower bounds of $\tilde O(d^2 \log n)$ and $\Omega(d \log n)$ are derived.
Intrinsic and Dual Volume Deviations of Convex Bodies and Polytopes
We establish estimates for the asymptotic best approximation of the Euclidean unit ball by polytopes under a notion of distance induced by the intrinsic volumes. We also introduce a notion of…


Memory-sample tradeoffs for linear regression with small error
We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of…
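The subquadratic-memory side of this tradeoff is exemplified by one-pass SGD, which stores only the current iterate, O(d) memory, and is the regime where the tradeoff forces a slower error rate than the Θ(d²)-memory sufficient-statistics solver. A minimal sketch (learning rate and names are illustrative):

```python
import numpy as np

def sgd_regression(stream, d, lr=0.01):
    """One-pass SGD for least squares. Memory is O(d): just the
    iterate w. This is the memory regime the paper's lower bound
    applies to."""
    w = np.zeros(d)
    for x, y in stream:
        w -= lr * (x @ w - y) * x   # gradient of (x.w - y)^2 / 2
    return w

rng = np.random.default_rng(0)
d = 5
w_true = rng.standard_normal(d)
data = [(x, x @ w_true) for x in rng.standard_normal((5000, d))]
w_hat = sgd_regression(iter(data), d)
```

On noiseless data SGD still converges; the paper's point is about the *rate* as a function of sample count when memory is subquadratic.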
Optimal Approximate Matrix Product in Terms of Stable Rank
We prove, using the subspace embedding guarantee in a black-box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having…
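The basic mechanism behind such results: apply a shared random sketch S to both factors, so that (SA)ᵀ(SB) approximates AᵀB while all computation happens in the reduced dimension. A minimal Gaussian-sketch sketch (the paper's actual maps are more refined; this only illustrates the idea):

```python
import numpy as np

def sketched_product(A, B, k, rng):
    """Approximate A^T B via a shared k x n Gaussian sketch S with
    i.i.d. N(0, 1/k) entries: E[(SA)^T (SB)] = A^T B, and the error
    shrinks as k grows. Cost scales with k rather than n."""
    n = A.shape[0]
    S = rng.standard_normal((k, n)) / np.sqrt(k)
    return (S @ A).T @ (S @ B)

rng = np.random.default_rng(0)
n, d = 2000, 10
A = rng.standard_normal((n, d))
exact = A.T @ A                               # d x d Gram matrix
approx = sketched_product(A, A, k=400, rng=rng)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

The cited work shows how to get the stronger *spectral-norm* guarantee with sketch size depending on the stable rank rather than the dimension.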
Mixing Implies Lower Bounds for Space Bounded Learning
It is shown that if a hypothesis class H, when viewed as a bipartite graph between hypotheses H and labeled examples X, is mixing, then learning it requires |H| examples under a certain memory bound, which means that most hypothesis classes are unlearnable with bounded memory.
Extractor-based time-space lower bounds for learning
This work shows that for a large class of learning problems, any learning algorithm requires either a memory of size at least Ω(k · ℓ) or at least 2^Ω(r) samples, achieving a tight Ω((log|X|) · (log|A|)) lower bound on the size of the memory.
A Time-Space Lower Bound for a Large Class of Learning Problems
  • R. Raz
  • Mathematics, Computer Science
  • 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
  • 2017
We prove a general time-space lower bound that applies for a large class of learning problems and shows that for every problem in that class, any learning algorithm requires either a memory of…
Fast Learning Requires Good Memory: A Time-Space Lower Bound for Parity Learning
  • R. Raz
  • Computer Science, Mathematics
  • 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS)
  • 2016
It is proved that any algorithm for learning parities requires either a memory of quadratic size or an exponential number of samples; this yields an encryption scheme, with a private key of length n and time complexity n per encrypted/decrypted bit, that is provably and unconditionally secure.
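The quadratic-memory side of this tradeoff is achieved by the textbook parity learner: store O(n) linear equations over F₂ (Θ(n²) bits) and solve by Gaussian elimination. A small sketch under that standard setup:

```python
import numpy as np

def learn_parity(X, y):
    """Recover the secret s from labels y = X @ s (mod 2) by Gaussian
    elimination over F_2. The stored augmented system is the Theta(n^2)
    memory; only O(n) random samples are needed, in contrast to the
    2^Omega(n) samples forced on low-memory learners."""
    n = X.shape[1]
    M = np.column_stack([X, y]) % 2            # augmented system [X | y]
    row = 0
    for col in range(n):
        piv = next((r for r in range(row, len(M)) if M[r, col]), None)
        if piv is None:
            continue
        M[[row, piv]] = M[[piv, row]]          # move pivot into place
        for r in range(len(M)):                # eliminate col everywhere else
            if r != row and M[r, col]:
                M[r] = (M[r] + M[row]) % 2
        row += 1
    return M[:n, n]   # assumes full rank: row i has its pivot in column i

rng = np.random.default_rng(1)
n = 16
s = rng.integers(0, 2, n)
X = rng.integers(0, 2, (4 * n, n))   # O(n) random examples suffice w.h.p.
y = (X @ s) % 2
s_hat = learn_parity(X, y)
```

Raz's theorem says this memory/sample profile is essentially forced: shrink the memory below quadratic and the sample cost blows up exponentially.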
Time-Space Tradeoffs for Learning from Small Test Spaces: Learning Low Degree Polynomial Functions
Any algorithm that learns $m$-variate homogeneous polynomial functions of degree at most $d$ over $\mathbb{F}_2$ from evaluations on randomly chosen inputs requires either space $\Omega(mn)$ or time $2^{\Omega(m)}$, where $n = m^{\Theta(d)}$ is the dimension of the space of such functions.
Time-Space Tradeoffs for Learning Finite Functions from Random Evaluations, with Applications to Polynomials
The time-space complexity of learning from random evaluations is reduced to the question of how much the corresponding evaluation matrix amplifies the 2-norms of "almost uniform" probability distributions, and bounds for learning polynomials over finite fields are derived.
Robust Subspace Approximation in a Stream
This work gives the first sublinear approximation algorithm for this problem in the turnstile streaming and arbitrary partition distributed models, achieving the same time guarantees as in the offline case.
Entropy Samplers and Strong Generic Lower Bounds For Space Bounded Learning
It is proved that any hypothesis class whose hypotheses graph is mixing cannot be learned using fewer than $\Omega(\log^2 |H|)$ memory bits unless the learner uses a large number of labeled examples.