Corpus ID: 11628574

Scaling Limit: Exact and Tractable Analysis of Online Learning Algorithms with Applications to Regularized Regression and PCA

@article{Wang2017ScalingLE,
  title={Scaling Limit: Exact and Tractable Analysis of Online Learning Algorithms with Applications to Regularized Regression and PCA},
  author={Chuang Wang and Jonathan C. Mattingly and Yue M. Lu},
  journal={ArXiv},
  year={2017},
  volume={abs/1712.04332}
}
We present a framework for analyzing the exact dynamics of a class of online learning algorithms in the high-dimensional scaling limit. Our results are applied to two concrete examples: online regularized linear regression and principal component analysis. As the ambient dimension tends to infinity, and with proper time scaling, we show that the time-varying joint empirical measures of the target feature vector and its estimates provided by the algorithms will converge weakly to a deterministic, measure-valued process.
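The algorithms covered are simple one-pass, first-order updates. As a hedged illustration of the regression example (not the paper's exact formulation), the sketch below runs per-sample SGD for l2-regularized linear regression with a 1/n step-size scaling typical of this high-dimensional regime; the step size, regularization level, and toy data are assumptions of the sketch.

```python
import numpy as np

def online_ridge_regression(stream, dim, tau=0.5, lam=0.1):
    """One-pass SGD for l2-regularized linear regression.

    Each item of `stream` is a pair (a, y) with a in R^dim and scalar y.
    The per-sample update is
        x <- x - (tau/dim) * ((a @ x - y) * a + lam * x),
    i.e. a stochastic gradient step on 0.5*(a @ x - y)**2 + 0.5*lam*||x||**2.
    The 1/dim scaling of the step size is an assumption of this sketch.
    """
    x = np.zeros(dim)
    for a, y in stream:
        grad = (a @ x - y) * a + lam * x
        x -= (tau / dim) * grad
    return x

# Toy usage: noisy linear observations of a planted vector xi.
rng = np.random.default_rng(0)
dim, n_samples = 200, 5000
xi = rng.standard_normal(dim) / np.sqrt(dim)
stream = ((a, a @ xi + 0.1 * rng.standard_normal())
          for a in rng.standard_normal((n_samples, dim)))
x_hat = online_ridge_regression(stream, dim)
print("cosine similarity:",
      x_hat @ xi / (np.linalg.norm(x_hat) * np.linalg.norm(xi) + 1e-12))
```

In the paper's scaling limit, it is the joint empirical distribution of the coordinates of the iterate and of the target vector whose time evolution becomes deterministic.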
The Scaling Limit of High-Dimensional Online Independent Component Analysis
TLDR
In the high-dimensional limit, the original coupled dynamics associated with the algorithm will be asymptotically "decoupled", with each coordinate independently solving a 1-D effective minimization problem via stochastic gradient descent.
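For a rough sense of the kind of one-pass update whose coordinates decouple in this way, the sketch below uses a generic kurtosis-style projection-pursuit step with renormalization; the nonlinearity g(u) = u**3 and the step size are assumptions, not necessarily the exact algorithm analyzed in that paper.

```python
import numpy as np

def online_ica_component(stream, dim, tau=1.0):
    """One-pass stochastic update for a single independent component.

    For each whitened sample y, take a step along a kurtosis-style
    contrast gradient g(w @ y) * y with g(u) = u**3, then renormalize
    w to the unit sphere.  This is a generic projection-pursuit-style
    sketch, not the cited paper's exact algorithm.
    """
    rng = np.random.default_rng(1)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for y in stream:
        u = w @ y
        w += (tau / dim) * (u ** 3) * y
        w /= np.linalg.norm(w)
    return w
```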
Online Power Iteration For Subspace Estimation Under Incomplete Observations: Limiting Dynamics And Phase Transitions
  • Hong Hu, Yue M. Lu
  • Mathematics, Computer Science
    2018 IEEE Statistical Signal Processing Workshop (SSP)
  • 2018
TLDR
This work shows that the dynamic performance of the imputation-based online power iteration method can be fully characterized by a finite-dimensional deterministic matrix recursion process, which provides an exact characterization of the relationship between estimation accuracy, sample complexity, and subsampling ratios.
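A minimal sketch of an imputation-based online power/Oja-style iteration is given below, under simplifying assumptions (rank-one subspace, a known observation rate p_obs, zero imputation rescaled by 1/p_obs); the step size and names are illustrative rather than taken from the paper.

```python
import numpy as np

def online_power_iteration_missing(stream, dim, p_obs, eta=0.05):
    """Online power/Oja-style iteration with zero imputation.

    Each stream item is (y_obs, mask): a length-dim vector holding the
    observed values and a boolean mask of which coordinates were seen.
    Unobserved entries are imputed with zero and the observed ones are
    rescaled by 1/p_obs, giving an unbiased surrogate for the full
    sample.  Step size eta and the rank-one setting are assumptions.
    """
    rng = np.random.default_rng(2)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for y_obs, mask in stream:
        y_hat = np.where(mask, y_obs / p_obs, 0.0)   # impute + rescale
        w += eta * (y_hat @ w) * y_hat               # rank-one update
        w /= np.linalg.norm(w)                       # project to sphere
    return w
```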
Online stochastic gradient descent on non-convex losses from high-dimensional inference
TLDR
The approach is illustrated by applying it to a wide set of inference tasks such as phase retrieval, parameter estimation for generalized linear models, spiked matrix models, and spiked tensor models, as well as to supervised learning for single-layer networks with general activation functions.
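For concreteness, here is what online (one-pass) SGD looks like for one of the listed tasks, real-valued phase retrieval; the quartic per-sample loss, step size, and initialization below are illustrative assumptions rather than the paper's prescriptions.

```python
import numpy as np

def online_sgd_phase_retrieval(stream, dim, eta=0.01):
    """One-pass SGD for real-valued phase retrieval.

    Each stream item is (a, y) with y = (a @ x_star) ** 2.  The
    per-sample loss is 0.25 * ((a @ x)**2 - y)**2, whose gradient is
    ((a @ x)**2 - y) * (a @ x) * a.  Step size and initialization are
    placeholder choices for this sketch.
    """
    rng = np.random.default_rng(3)
    x = rng.standard_normal(dim) / np.sqrt(dim)
    for a, y in stream:
        u = a @ x
        x -= (eta / dim) * (u ** 2 - y) * u * a
    return x
```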
A classification for the performance of online SGD for high-dimensional inference
TLDR
This work investigates the performance of the simplest version of SGD at attaining a "better than random" correlation with the unknown parameter, i.e., achieving weak recovery, and classifies the difficulty of typical instances of this task for online SGD in terms of the number of samples required as the dimension diverges.
Subspace Estimation From Incomplete Observations: A High-Dimensional Analysis
We present a high-dimensional analysis of three popular algorithms, namely, Oja's method, GROUSE, and PETRELS, for subspace estimation from streaming and highly incomplete observations. We show that …
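Of the three, Oja's method has the simplest streaming update; the single-component sketch below assumes full observations and an arbitrary step size, while GROUSE and PETRELS, analyzed in the same work, extend this style of update to subspaces and to incomplete observations.

```python
import numpy as np

def oja_single_component(stream, dim, eta=0.05):
    """Oja's rule for the leading principal component, one data pass.

    For each sample y: w <- w + eta * (y @ w) * y, followed by
    renormalization to the unit sphere.  The step size eta is a
    placeholder, not a value recommended in the cited work.
    """
    rng = np.random.default_rng(4)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for y in stream:
        w += eta * (y @ w) * y
        w /= np.linalg.norm(w)
    return w
```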
A Solvable High-Dimensional Model of GAN
TLDR
It is proved that the macroscopic quantities measuring the quality of the training process converge to a deterministic process characterized by an ordinary differential equation (ODE), whereas the microscopic states containing all the detailed weights remain stochastic.
A Mean-Field Theory for Learning the Schönberg Measure of Radial Basis Functions
TLDR
A projected particle Langevin optimization method is developed and analyzed for learning the distribution in the Schönberg integral representation of radial basis functions from training samples, and the existence and uniqueness of weak steady-state solutions of the derived PDE are established.
Streaming PCA and Subspace Tracking: The Missing Data Case
TLDR
It is illustrated that streaming PCA and subspace tracking algorithms can be understood through algebraic and geometric perspectives, and they need to be adjusted carefully to handle missing data.
Mean Field Analysis of Deep Neural Networks
We analyze multi-layer neural networks in the asymptotic regime of simultaneously (A) large network sizes and (B) large numbers of stochastic gradient descent training iterations. We rigorously
A Mean-Field Theory for Kernel Alignment with Random Features in Generative Adversarial Networks
TLDR
A novel supervised learning method is developed to optimize the kernel in maximum mean discrepancy generative adversarial networks (MMD GANs); with kernel learning the model attains higher Inception Scores and better Fréchet Inception Distances, and generates better images, compared to the generative moment matching network (GMMN) and MMD GAN with untrained kernels.

References

SHOWING 1-10 OF 68 REFERENCES
The Scaling Limit of High-Dimensional Online Independent Component Analysis
TLDR
In the high-dimensional limit, the original coupled dynamics associated with the algorithm will be asymptotically "decoupled", with each coordinate independently solving a 1-D effective minimization problem via stochastic gradient descent.
Online learning for sparse PCA in high dimensions: Exact dynamics and phase transitions
TLDR
In the high-dimensional limit, the joint empirical measure of the underlying sparse eigenvector and its estimate provided by the algorithm is shown to converge weakly to a deterministic, measure-valued process.
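As a rough illustration of a one-pass sparse-PCA update of this flavor (an Oja step followed by soft-thresholding), the sketch below is not necessarily the exact algorithm analyzed in the cited paper, and the step size and threshold level are placeholders.

```python
import numpy as np

def soft_threshold(v, thr):
    """Entrywise soft-thresholding: sign(v) * max(|v| - thr, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def online_sparse_pca(stream, dim, eta=0.05, thr=1e-3):
    """Oja-style streaming update with a sparsifying proximal step.

    Each sample triggers (1) a rank-one Oja step, (2) soft-thresholding
    of the iterate to promote sparsity, and (3) renormalization.
    """
    rng = np.random.default_rng(5)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for y in stream:
        w += eta * (y @ w) * y
        w = soft_threshold(w, thr)
        w /= np.linalg.norm(w) + 1e-12
    return w
```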
Sharp Time–Data Tradeoffs for Linear Inverse Problems
TLDR
A unified convergence analysis of the gradient projection algorithm applied to such problems is presented, demonstrating that a linear convergence rate is attainable even though the least-squares objective is not strongly convex in these settings.
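The gradient projection algorithm analyzed there alternates a gradient step on the least-squares objective with Euclidean projection onto the constraint set. A generic sketch follows, with projection onto an l2-ball standing in for the structured constraint sets treated in the paper; the step size and iteration count are illustrative.

```python
import numpy as np

def projected_gradient(A, y, project, eta, n_iters=200):
    """Gradient projection for min 0.5 * ||A x - y||^2 subject to x in C.

    `project` maps a point to its Euclidean projection onto C.
    A common choice for eta is roughly 1 / (largest eigenvalue of A.T @ A).
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = project(x - eta * A.T @ (A @ x - y))
    return x

# Example stand-in constraint set: the Euclidean ball of radius r.
def l2_ball_projection(r):
    return lambda v: v if np.linalg.norm(v) <= r else r * v / np.linalg.norm(v)
```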
The Dynamics of Message Passing on Dense Graphs, with Applications to Compressed Sensing
TLDR
This paper proves that state evolution indeed holds asymptotically in the large-system limit for sensing matrices with independent and identically distributed Gaussian entries, providing a rigorous foundation for the state evolution analysis of approximate message passing.
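For reference, a minimal sketch of the AMP iteration with a soft-threshold denoiser, including the Onsager correction term that underlies the state-evolution result; the fixed threshold and iteration count are illustrative assumptions rather than the tuned policy from the paper.

```python
import numpy as np

def amp_compressed_sensing(A, y, theta, n_iters=30):
    """Approximate message passing with a soft-threshold denoiser.

    Iterates
        r   = x + A.T @ z
        x'  = soft_threshold(r, theta)
        z   = y - A @ x' + (n/m) * mean(eta'(r)) * z_prev
    where eta is entrywise soft-thresholding; for that denoiser,
    eta'(r) is 1 exactly where x' is nonzero.  The last term in z is
    the Onsager correction that makes the scalar state evolution exact
    for i.i.d. Gaussian A.
    """
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iters):
        r = x + A.T @ z
        x_new = np.sign(r) * np.maximum(np.abs(r) - theta, 0.0)
        onsager = (n / m) * np.mean(np.abs(x_new) > 0) * z
        z = y - A @ x_new + onsager
        x = x_new
    return x
```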
Asymptotic Analysis of MAP Estimation via the Replica Method and Applications to Compressed Sensing
TLDR
It is shown that, with random linear measurements and Gaussian noise, the replica-symmetric prediction is correct: the asymptotic behavior of the postulated MAP estimate of an n-dimensional vector "decouples" into n scalar postulated MAP estimators.
A Direct Formulation for Sparse PCA Using Semidefinite Programming
TLDR
A modification of the classical variational representation of the largest eigenvalue of a symmetric matrix is used, where cardinality is constrained, and a semidefinite programming-based relaxation is derived for the sparse PCA problem.
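A hedged sketch of that relaxation (the DSPCA-style SDP) using cvxpy, which is not mentioned in the text; the l1 budget k stands in for the cardinality constraint, and an SDP-capable solver (e.g. cvxpy's bundled SCS) is assumed.

```python
import numpy as np
import cvxpy as cp

def sparse_pca_sdp(Sigma, k):
    """Semidefinite relaxation for sparse PCA (DSPCA-style).

    maximize    trace(Sigma @ X)
    subject to  trace(X) == 1,  sum(|X_ij|) <= k,  X PSD,

    where the l1 constraint is the convex surrogate for the cardinality
    constraint in the variational form of the largest eigenvalue.
    """
    n = Sigma.shape[0]
    X = cp.Variable((n, n), PSD=True)
    objective = cp.Maximize(cp.trace(Sigma @ X))
    constraints = [cp.trace(X) == 1, cp.sum(cp.abs(X)) <= k]
    cp.Problem(objective, constraints).solve()
    return X.value
```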
Vector approximate message passing
TLDR
This paper considers a “vector AMP” (VAMP) algorithm and shows that VAMP has a rigorous scalar state-evolution that holds under a much broader class of large random matrices A: those that are right-rotationally invariant.
Robust Stochastic Approximation Approach to Stochastic Programming
TLDR
It is intended to demonstrate that a properly modified stochastic approximation (SA) approach can be competitive with, and even significantly outperform, the sample average approximation (SAA) method for a certain class of convex stochastic problems.
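The modification advocated there is, roughly, stochastic gradient steps with long, slowly decaying step sizes combined with averaging of the iterates. A generic sketch under those assumptions follows; the oracle interface and step-size rule are illustrative, not the paper's exact prescription.

```python
import numpy as np

def averaged_sgd(grad_oracle, x0, n_steps, step_fn):
    """Robust-SA-style SGD: long steps plus averaging of the iterates.

    `grad_oracle(x, t)` returns a stochastic gradient at x on step t;
    `step_fn(t)` returns the step size, e.g. a slowly decaying
    c / sqrt(t) schedule.  Returns the running average of the
    trajectory rather than the last iterate.
    """
    x = np.array(x0, dtype=float)
    x_bar = np.zeros_like(x)
    for t in range(1, n_steps + 1):
        x = x - step_fn(t) * grad_oracle(x, t)
        x_bar += (x - x_bar) / t          # running average of iterates
    return x_bar
```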
Phase transitions in semidefinite relaxations
TLDR
Asymptotic predictions for several detection thresholds are developed, as well as for the estimation error above these thresholds, to clarify the effectiveness of SDP relaxations in solving high-dimensional statistical problems.
Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization
TLDR
This paper investigates the optimality of SGD in a stochastic setting and shows that, for smooth problems, the algorithm attains the optimal O(1/T) rate; for non-smooth problems, however, the convergence rate with averaging might really be Ω(log(T)/T), and this is not just an artifact of the analysis.