Corpus ID: 15134950

Minimax Subsampling for Estimation and Prediction in Low-Dimensional Linear Regression

Yining Wang, Aarti Singh
Subsampling strategies are derived to sample a small portion of design (data) points in a low-dimensional linear regression model $y=X\beta+\varepsilon$ with near-optimal statistical rates. Our results apply to both problems of estimation of the underlying linear model $\beta$ and predicting the real-valued response $y$ of a new data point $x$. The derived subsampling strategies are minimax optimal under the fixed design setting, up to a small $(1+\epsilon)$ relative factor. We also give…
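The setting in the abstract (keep only a small subset of the rows of $X$, observe the corresponding responses, and estimate $\beta$ from them) can be illustrated with a minimal sketch, assuming NumPy. Uniform and leverage-score sampling are used here only as familiar stand-ins; they are not the minimax strategies derived in the paper, and the function name `subsampled_ols` and its parameters are hypothetical.

```python
import numpy as np


def subsampled_ols(X, y, k, rng, scheme="uniform"):
    """Fit ordinary least squares on k rows of (X, y) drawn without replacement.

    Illustrative only: `scheme` picks a stand-in sampling distribution,
    not the subsampling strategy analyzed in the paper.
    """
    n, _ = X.shape
    if scheme == "uniform":
        probs = np.full(n, 1.0 / n)
    elif scheme == "leverage":
        # Leverage scores are the squared row norms of Q in a thin QR of X.
        Q, _ = np.linalg.qr(X)
        lev = np.sum(Q**2, axis=1)
        probs = lev / lev.sum()
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")

    # Choose which design points to keep; only their responses are used.
    idx = rng.choice(n, size=k, replace=False, p=probs)
    beta_hat, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return beta_hat, idx


# Synthetic data from y = X beta + noise (all values here are assumptions).
rng = np.random.default_rng(0)
n, p, k = 2000, 5, 200
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

for scheme in ("uniform", "leverage"):
    beta_hat, _ = subsampled_ols(X, y, k, rng, scheme=scheme)
    print(scheme, np.linalg.norm(beta_hat - beta_true))
```

The sketch fits unweighted least squares on the selected rows, which is reasonable when the linear model is well specified; comparing the two printed errors gives a rough feel for how the choice of rows affects estimation accuracy.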
Minimax Linear Regression under Measurement Constraints
We consider the problem of linear regression under measurement constraints and derive computationally feasible subsampling strategies to sample a small portion of design (data) points in a linear…
Error Analysis of Generalized Nyström Kernel Regression
The generalized Nyström kernel regression (GNKR) with $\ell_2$ coefficient regularization is considered, where the kernel is required only to be continuous and bounded, and a fast learning rate with polynomial decay is obtained for the GNKR.


Optimal Subsampling Approaches for Large Sample Linear Regression
A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is…
Regression Shrinkage and Selection via the Lasso
We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a…
Random Design Analysis of Ridge Regression
This work gives a simultaneous analysis of both the ordinary least squares estimator and the ridge regression estimator in the random design setting under mild assumptions on the covariate/response…
Active Regression by Stratification
This is the first active learner for this setting that provably can improve over passive learning and provides finite sample convergence guarantees for general distributions in the misspecified model.
A statistical perspective on algorithmic leveraging
This work provides an effective framework to evaluate the statistical properties of algorithmic leveraging in the context of estimating parameters in a linear regression model and shows that from the statistical perspective of bias and variance, neither leverage-based sampling nor uniform sampling dominates the other.
Fast Randomized Kernel Ridge Regression with Statistical Guarantees
A version of this approach that comes with running time guarantees as well as improved guarantees on its statistical performance is described, and a fast algorithm is presented to quickly compute coarse approximations to these scores in time linear in the number of samples.
On the sample covariance matrix estimator of reduced effective rank population matrices, with applications to fPCA
This work provides a unified analysis of the properties of the sample covariance matrix $\Sigma_n$ over the class of $p\times p$ population covariance matrices $\Sigma$ of reduced effective rank…
Fast Relative-Error Approximation Algorithm for Ridge Regression
To the best of the authors' knowledge, this is the first algorithm for ridge regression that runs in $o(n^2 p)$ time with a provable relative-error approximation bound on the output vector; empirical results are shown on both synthetic and real datasets.
Relative-Error CUR Matrix Decompositions
These two algorithms are the first polynomial time algorithms for such low-rank matrix approximations that come with relative-error guarantees; previously, in some cases, it was not even known whether such matrix decompositions exist.
Faster least squares approximation
This work presents two randomized algorithms that provide accurate relative-error approximations to the optimal value and the solution vector of a least squares approximation problem more rapidly than existing exact algorithms.