Robust Partially-Compressed Least-Squares

Stephen Becker, Ban Kawas, Marek Petrik
Randomized matrix compression techniques, such as the Johnson-Lindenstrauss transform, have emerged as an effective and practical way to solve large-scale problems efficiently. The focus on computational efficiency, however, comes at the cost of solution quality and accuracy. In this paper, we investigate compressed least-squares problems and propose new models and algorithms that address the error and noise introduced by compression. While maintaining…
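As a hedged illustration of the compressed least-squares setting the abstract describes (not the paper's specific robust model), one can sketch a tall system with a Gaussian random map and solve the smaller projected problem; all sizes and names below are illustrative assumptions.

```python
import numpy as np

# Sketched least-squares: solve min_x ||S A x - S b|| with a random
# Gaussian sketch S, a Johnson-Lindenstrauss-style compression.
rng = np.random.default_rng(0)
n, d, m = 2000, 20, 200                       # tall system; sketch size m << n
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketching matrix

x_full, *_ = np.linalg.lstsq(A, b, rcond=None)            # exact solution
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)  # compressed solution

# The gap between the two solutions is the compression-induced error
# that robust formulations of compressed least-squares aim to control.
err = np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full)
```

The sketched problem has m rows instead of n, so the dominant cost drops from O(nd^2) to roughly O(md^2) plus the cost of applying the sketch.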


Error Estimation for Randomized Least-Squares Algorithms via the Bootstrap
A bootstrap method computes a posteriori error estimates for randomized least-squares algorithms, permitting the user to numerically assess the error of a given solution and to predict how much work is needed to improve a "preliminary" solution.
Compressed and Penalized Linear Regression
This article provides the first efficient methods for tuning parameter selection, compares the methods with current approaches via simulation and application, and provides theoretical intuition which makes explicit the impact of approximation on statistical efficiency and demonstrates the necessity of careful parameter tuning.
Statistical properties of sketching algorithms
It is argued that the sketched data can be modelled as a random sample, thus placing this family of data compression methods firmly within an inferential framework and demonstrating the theory and the limits of its applicability on two datasets.
A robust approach to warped Gaussian process-constrained optimization
This work introduces a new class of constraints in which the same black-box function occurs multiple times evaluated at different domain points, and reformulates these uncertain constraints into deterministic constraints guaranteed to be satisfied with a specified probability.


Faster least squares approximation
This work presents two randomized algorithms that provide accurate relative-error approximations to the optimal value and the solution vector of a least squares approximation problem more rapidly than existing exact algorithms.
Robust Solutions to Least-Squares Problems with Uncertain Data
We consider least-squares problems where the coefficient matrix A and vector b are unknown but bounded. We minimize the worst-case residual error using (convex) second-order cone programming.
Randomized Algorithms for Matrices and Data
This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis.
Near-Optimal Coresets for Least-Squares Regression
Deterministic, low-order polynomial-time algorithms are given to construct such coresets with approximation guarantees, together with lower bounds indicating that there is not much room for improvement upon the results.
Blendenpik: Supercharging LAPACK's Least-Squares Solver
A least-squares solver for dense, highly overdetermined systems that achieves residuals similar to those of direct QR-factorization-based solvers, outperforms LAPACK by large factors, and scales significantly better than any QR-based solver.
Recovering the Optimal Solution by Dual Random Projection
The theoretical analysis shows that with a high probability, the proposed algorithm is able to accurately recover the optimal solution to the original problem, provided that the data matrix is of low rank or can be well approximated by a low rank matrix.
Sketching as a Tool for Numerical Linear Algebra
This survey highlights recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, and considers least squares as well as robust regression problems, low-rank approximation, and graph sparsification.
SPIRAL: Code Generation for DSP Transforms
SPIRAL generates high-performance code for a broad set of DSP transforms, including the discrete Fourier transform, other trigonometric transforms, filter transforms, and discrete wavelet transforms.
Random Projections for $k$-means Clustering
It is proved that any set of n points in d dimensions can be projected into t = Ω(k/ε²) dimensions, for any ε ∈ (0, 1/3), in O(nd⌈ε⁻²k/log(d)⌉) time, such that with constant probability the optimal k-partition of the point set is preserved within a factor of 2 + ε.
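A minimal sketch of the dimension-reduction idea summarized above, assuming a plain Gaussian random projection (the parameter choices are illustrative, not those analyzed in the paper): pairwise distances, and hence k-means costs, are approximately preserved after projecting to far fewer dimensions.

```python
import numpy as np

# Project n points in d dimensions down to t dimensions with a
# scaled Gaussian random map; Euclidean distances are approximately
# preserved, which is what k-means clustering relies on.
rng = np.random.default_rng(1)
n, d, t = 500, 100, 20
X = rng.standard_normal((n, d))
R = rng.standard_normal((d, t)) / np.sqrt(t)  # random projection matrix
Y = X @ R                                     # projected points

# Distortion of one sample pairwise distance after projection
orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
ratio = proj / orig
```

Running k-means on Y instead of X trades a small, controllable distortion of the clustering cost for a 5x reduction in dimension here.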
Random Projection for High Dimensional Data Clustering: A Cluster Ensemble Approach
Empirical results show that the proposed approach achieves better and more robust clustering performance compared to not only single runs of random projection/clustering but also clustering with PCA, a traditional data reduction method for high dimensional data.