Universality laws for randomized dimension reduction, with applications

Samet Oymak and Joel A. Tropp
Dimension reduction is the process of embedding high-dimensional data into a lower dimensional space to facilitate its analysis. In the Euclidean setting, one fundamental technique for dimension reduction is to apply a random linear map to the data. This dimension reduction procedure succeeds when it preserves certain geometric features of the set. The question is how large the embedding dimension must be to ensure that randomized dimension reduction succeeds with high probability. This paper… 
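As a rough illustration of the procedure the abstract describes, the sketch below applies a random linear map with independent Gaussian entries to a small point cloud and reports how much the pairwise distances change. It is not taken from the paper: the helper names (`gaussian_map`, `apply_map`, `distortion_range`), the N(0, 1/m) scaling, and the dimensions are all assumptions chosen for illustration, and pairwise distance preservation is only one example of a geometric feature one might want to keep.

```python
import math
import random


def gaussian_map(m, d, rng):
    """Return an m x d matrix with i.i.d. N(0, 1/m) entries (assumed scaling)."""
    scale = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(m)]


def apply_map(matrix, x):
    """Multiply the matrix by the vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def distance(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distortion_range(points, matrix):
    """Smallest and largest ratio of embedded to original pairwise distance."""
    embedded = [apply_map(matrix, p) for p in points]
    ratios = [
        distance(embedded[i], embedded[j]) / distance(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]
    return min(ratios), max(ratios)


rng = random.Random(0)
d, m, n = 500, 100, 15  # ambient dimension, embedding dimension, point count (illustrative)
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
low, high = distortion_range(points, gaussian_map(m, d, rng))
print(f"pairwise distance ratios lie in [{low:.3f}, {high:.3f}]")
```

Ratios close to 1 mean the map approximately preserved the distances. Shrinking `m` widens the interval, which is the trade-off behind the question of how large the embedding dimension must be.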
Random projections of random manifolds
This work finds explicitly computable, approximate theoretical bounds on how many projections are needed to accurately preserve the geometry of smooth Gaussian random manifolds, which can only be violated with a probability that is exponentially small in the ambient dimension.
Non-Gaussian Random Matrices on Sets: Optimal Tail Dependence and Applications
The optimal tail dependency on the sub-gaussian parameter is presented and proved through a new version of Bernstein's inequality, and popular applications whose theoretical guarantees can be improved by the results are illustrated.
Bound-constrained global optimization of functions with low effective dimensionality using multiple random embeddings
Using the success probability of the reduced subproblems, it is proved that X-REGO converges globally, with probability one, and linearly in the number of embeddings, to a constrained global minimizer.
Spectral Properties of Heavy-Tailed Random Matrices
The classical Random Matrix Theory studies asymptotic spectral properties of random matrices when their dimensions grow to infinity. In contrast, the non-asymptotic branch of the theory is focused on
High-Dimensional Probability
A broad range of illustrations is embedded throughout, including classical and modern results for covariance estimation, clustering, networks, semidefinite programming, coding, dimension reduction, matrix completion, machine learning, compressed sensing, and sparse regression.
CS 6220: DATA-SPARSE MATRIX COMPUTATIONS Lecture 7: Low-dimensional embeddings
  • Computer Science
  • 2020
Low-dimensional embeddings are introduced as a method of dimensionality reduction, and it is seen that such embeddings are cheap to compute using randomness while preserving geometric attributes of the input data.
A New Theory for Sketching in Linear Regression
  • Edgar Dobriban, Sifan Liu
  • Computer Science, Mathematics
  • 2018
This work studies the statistical performance of sketching algorithms for linear regression using asymptotic random matrix theory and free probability theory, and finds precise and simple expressions for the accuracy loss of these methods.
Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions
This paper characterizes, for the first time, the fundamental limits on the statistical accuracy of convex ERM for inference in high-dimensional generalized linear models, and derives tight lower bounds on the estimation and prediction error that hold over a wide class of loss functions and for any value of the regularization parameter.
Universality Laws for High-Dimensional Learning with Random Features
We prove a universality theorem for learning with random features. Our result shows that, in terms of training and generalization errors, the random feature model with a nonlinear activation function
Sub‐Gaussian Matrices on Sets: Optimal Tail Dependence and Applications
Random linear mappings are widely used in modern signal processing, compressed sensing, and machine learning. These mappings may be used to embed the data into a significantly lower dimension while


Randomized sketches of convex programs with sharp guarantees
This work analyzes RP-based approximations of convex programs, in which the original optimization problem is approximated by the solution of a lower-dimensional problem, and proves that the approximation ratio of this procedure can be bounded in terms of the geometry of constraint set.
Concentration Inequalities - A Nonasymptotic Theory of Independence
Deep connections with isoperimetric problems are revealed whilst special attention is paid to applications to the supremum of empirical processes.
Toward a Unified Theory of Sparse Dimensionality Reduction in Euclidean Space
This work qualitatively unifies several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. It introduces a new complexity parameter, which depends on the geometry of T, and shows that it suffices to choose s and m such that this parameter is small.
Noise stability of functions with low influences: Invariance and optimality
An invariance principle for multilinear polynomials with low influences and bounded degree is proved; it shows that under mild conditions the distribution of such polynomials is essentially invariant for all product spaces.
Living on the edge: phase transitions in convex programs with random data
This paper provides the first rigorous analysis that explains why phase transitions are ubiquitous in random convex optimization problems and introduces a summary parameter, called the statistical dimension, that canonically extends the dimension of a linear subspace to the class of convex cones.
Random matrices: Universality of local spectral statistics of non-Hermitian matrices
This paper shows that a real n×n matrix whose entries are jointly independent, exponentially decaying and whose moments match the real Gaussian ensemble to fourth order has $\sqrt{\frac{2n}{\pi}} + o(\sqrt{n})$ real eigenvalues asymptotically almost surely.
Anisotropic local laws for random matrices
We develop a new method for deriving local laws for a large class of random matrices. It is applicable to many matrix models built from sums and products of deterministic or independent random
Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing
  • D. Donoho, J. Tanner
  • Computer Science
    Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
  • 2009
An extensive computational experiment and formal inferential analysis are conducted to test the hypothesis that phase transitions occurring in modern high-dimensional data analysis and signal processing are universal across a range of underlying matrix ensembles, and the results show that finite-sample universality can be rejected.
Counting faces of randomly-projected polytopes when the projection radically lowers dimension
This paper develops asymptotic methods to count faces of random high-dimensional polytopes, a seemingly dry and unpromising pursuit that has surprising implications in statistics, probability, information theory, and signal processing, with potential impacts in practical subjects like medical imaging and digital communications.
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.