Cameron Musco

Since being analyzed by Rokhlin, Szlam, and Tygert [1] and popularized by Halko, Martinsson, and Tropp [2], randomized Simultaneous Power Iteration has become the method of choice for approximate singular value decomposition. It is more accurate than simpler sketching algorithms, yet still converges quickly for any matrix, independently of singular value gaps. …
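For a concrete point of reference, here is a minimal NumPy sketch of simultaneous (block) power iteration in the spirit of [2]. The block size, iteration count, and QR-based re-orthonormalization are illustrative choices, not the exact scheme analyzed in the paper.

```python
import numpy as np

def simultaneous_power_iteration(A, k, iters=5, seed=0):
    # Start from a random block, alternate multiplication by A and A^T,
    # re-orthonormalizing each time for numerical stability.
    rng = np.random.default_rng(seed)
    Q = rng.normal(size=(A.shape[1], k))
    for _ in range(iters):
        Q, _ = np.linalg.qr(A @ Q)      # n x k
        Q, _ = np.linalg.qr(A.T @ Q)    # d x k
    # Recover an approximate rank-k SVD from the converged subspace.
    U, s, Vt = np.linalg.svd(A @ Q, full_matrices=False)
    return U, s, Vt @ Q.T

# Example: compare to the true top singular values.
rng = np.random.default_rng(1)
A = rng.normal(size=(300, 100))
U, s, Vt = simultaneous_power_iteration(A, k=5)
print(s, np.linalg.svd(A, compute_uv=False)[:5])
```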
In this paper we provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix AᵀA. In particular, we give the following results for computing an approximate eigenvector, i.e., some x such that xᵀAᵀAx ≥ (1−ε)λ₁(AᵀA): • Offline Eigenvector Estimation: Given an explicit matrix A ∈ ℝ^{n×d}, we show how to compute an approximate top eigenvector …
We show how to approximate a data matrix A with a much smaller sketch Ã that can be used to solve a general class of constrained k-rank approximation problems to within (1+ε) error. Importantly, this class includes k-means clustering and unconstrained low-rank approximation (i.e., principal component analysis). By reducing data points to just O(k) dimensions …
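The guarantee can be seen empirically with a small experiment: compress the rows of A with a random-sign matrix, find the best rank-k projection from the sketch alone, and check its cost on the full matrix. The sketch width m and the random-sign construction here are assumptions for illustration; the paper's constructions and exact dimensions differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k, m = 500, 200, 5, 60   # sketch width m on the order of k (illustrative)
A = rng.normal(size=(n, k)) @ rng.normal(size=(k, d)) + 0.1 * rng.normal(size=(n, d))

# Compress each data point (row) from d to m dimensions with random signs.
Pi = rng.choice([-1.0, 1.0], size=(d, m)) / np.sqrt(m)
A_sk = A @ Pi

# Solve unconstrained rank-k approximation on the sketch: its top-k left
# singular vectors define a rank-k projection of the rows of A.
U, _, _ = np.linalg.svd(A_sk, full_matrices=False)
P = U[:, :k] @ U[:, :k].T

# The projection found from the sketch is near-optimal for the full matrix.
cost = np.sum((A - P @ A) ** 2)
opt = np.sum(np.linalg.svd(A, compute_uv=False)[k:] ** 2)
print(cost, opt)
```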
Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time significantly. For theoretical performance guarantees, each row must be sampled with probability proportional to its statistical leverage score. …
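As a minimal sketch of classical leverage score sampling for regression (computing exact scores via a QR factorization, which the literature improves on): sample rows proportionally to leverage, reweight, and solve the smaller problem. The sample size m is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 10000, 20, 500       # sample size m is an illustrative choice
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Leverage score of row i = squared norm of row i of Q, where A = QR.
Q, _ = np.linalg.qr(A)
lev = np.sum(Q ** 2, axis=1)   # scores sum to d for full-rank A
p = lev / lev.sum()

# Sample rows proportionally to leverage and reweight by 1/sqrt(m * p_i)
# so the subsampled least-squares problem is unbiased.
idx = rng.choice(n, size=m, p=p)
w = 1.0 / np.sqrt(m * p[idx])
x_sub = np.linalg.lstsq(A[idx] * w[:, None], b[idx] * w, rcond=None)[0]
x_full = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_sub - x_full))
```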
We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components. Specifically, we introduce an iterative algorithm that provably computes the projection using few calls to any black-box routine for ridge regression. By avoiding explicit principal component analysis (PCA), our algorithm …
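The core idea can be illustrated as follows: one ridge regression call applies the eigen-filter σ²/(σ² + λ) to each principal component, and composing a polynomial that pushes filter values toward {0, 1} approximates the hard projection at threshold λ. The smoothstep iteration below is only an assumed stand-in for the paper's carefully chosen polynomial, and a direct solve stands in for any black-box ridge solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 2000, 50, 500.0    # lam: eigenvalue threshold, chosen mid-spectrum
A = rng.normal(size=(n, d)) * np.linspace(3.0, 0.1, d)  # decaying spectrum
x = rng.normal(size=d)

def ridge_apply(v):
    # Stand-in for a black-box ridge regression routine: applies
    # (A^T A + lam I)^{-1} A^T A to v. Any iterative solver could be used.
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ (A @ v))

def sharpened(v, t):
    # Iterate g(x) = 3x^2 - 2x^3 ("smoothstep") on the ridge operator's
    # eigen-filter sigma^2/(sigma^2 + lam), pushing it toward a 0/1 step
    # at sigma^2 = lam. (The paper uses a different, faster polynomial.)
    if t == 0:
        return ridge_apply(v)
    u1 = sharpened(v, t - 1)
    u2 = sharpened(u1, t - 1)
    u3 = sharpened(u2, t - 1)
    return 3 * u2 - 2 * u3

# Compare against the exact projection onto components with sigma^2 > lam.
_, s, Vt = np.linalg.svd(A, full_matrices=False)
P = Vt.T[:, s ** 2 > lam] @ Vt[s ** 2 > lam]
print(np.linalg.norm(sharpened(x, 3) - P @ x), np.linalg.norm(P @ x))
```

The residual error concentrates on components with σ² near the threshold λ, where any low-degree polynomial filter is necessarily soft.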
We present the first single-pass algorithm for computing spectral sparsifiers of graphs in the dynamic semi-streaming model. Given a single pass over a stream containing insertions and deletions of edges to a graph G, our algorithm maintains a randomized linear sketch of the incidence matrix into dimension O((1/ε²)·n·polylog(n)). Using this sketch …
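The key property that makes a single pass over insertions and deletions possible is linearity of the sketch. The paper's construction is built from ℓ₀-sampling sketches that allow edges to be recovered for sparsification; the simpler ±1 projection below is only an assumed illustration of how linearity lets deletions exactly cancel insertions.

```python
import numpy as np

def edge_index(u, v, n):
    # Canonical slot of undirected edge {u, v} among the n*(n-1)/2 pairs.
    u, v = min(u, v), max(u, v)
    return u * n - u * (u + 1) // 2 + (v - u - 1)

class IncidenceSketch:
    """Maintains S = Pi @ B for the signed edge-incidence matrix B, where
    Pi has pseudo-random +/-1 columns (one per potential edge, regenerated
    from a seed so Pi is never stored). By linearity, an insertion adds
    the corresponding rank-one term and a deletion subtracts it."""
    def __init__(self, n, k, seed=0):
        self.n, self.k, self.seed = n, k, seed
        self.S = np.zeros((k, n))

    def _pi_col(self, idx):
        rng = np.random.default_rng((self.seed, idx))
        return rng.choice([-1.0, 1.0], size=self.k)

    def update(self, u, v, sign):       # sign = +1 insert, -1 delete
        b = np.zeros(self.n)
        b[min(u, v)], b[max(u, v)] = 1.0, -1.0
        self.S += sign * np.outer(self._pi_col(edge_index(u, v, self.n)), b)

# A deletion exactly cancels the matching earlier insertion:
sk = IncidenceSketch(n=10, k=8)
sk.update(2, 5, +1)
sk.update(3, 7, +1)
sk.update(2, 5, -1)
ref = IncidenceSketch(n=10, k=8)
ref.update(3, 7, +1)
print(np.allclose(sk.S, ref.S))   # True
```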
We give the first algorithm for kernel Nyström approximation that runs in linear time in the number of training points and is provably accurate for all kernel matrices, without dependence on regularity or incoherence conditions. The algorithm projects the kernel onto a set of s landmark points sampled by their ridge leverage scores, requiring just O(ns) kernel evaluations …
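A minimal sketch of the Nyström construction itself: given s landmarks, the kernel is approximated from only its columns at those landmarks. Landmarks are drawn uniformly here for brevity; the paper's accuracy guarantee comes from sampling them by recursively approximated ridge leverage scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 500, 40                 # n points, s landmarks (illustrative sizes)
X = rng.normal(size=(n, 10))
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-0.5 * sq)          # Gaussian (RBF) kernel matrix

# Landmarks chosen uniformly here; the paper samples by ridge leverage scores.
S = rng.choice(n, size=s, replace=False)
C = K[:, S]                    # kernel between all points and landmarks
W = K[np.ix_(S, S)]            # kernel among the landmarks

K_nys = C @ np.linalg.pinv(W) @ C.T   # rank-s Nystrom approximation
print(np.linalg.norm(K - K_nys) / np.linalg.norm(K))
```

Note that only the n·s entries in C and the s² entries in W are ever evaluated; the full K is formed above solely to measure the approximation error.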
We give faster algorithms and improved sample complexities for the fundamental problem of estimating the top eigenvector. Given an explicit matrix A ∈ ℝ^{n×d}, we show how to compute an approximate top eigenvector of AᵀA in time Õ((nnz(A) + d·sr(A)/gap²)·log(1/ε)). Here nnz(A) is the number of nonzeros in A, sr(A) is the stable rank, and gap is the relative gap between the top two eigenvalues. …
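For context, here is the classical power-method baseline whose Õ(nnz(A)·log(1/ε)/gap) cost the stated bound improves, by decoupling the nnz(A) term from the gap dependence (via techniques such as shift-and-invert preconditioning). Sizes and iteration counts below are illustrative.

```python
import numpy as np

def top_eigenvector(A, iters, seed=0):
    # Baseline power method on A^T A. Each iteration costs one product
    # with A and one with A^T, i.e., O(nnz(A)) time, and the iteration
    # count needed scales as O(log(1/eps) / gap).
    rng = np.random.default_rng(seed)
    x = rng.normal(size=A.shape[1])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x = A.T @ (A @ x)
        x /= np.linalg.norm(x)
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(2000, 100))
x = top_eigenvector(A, iters=100)
print(x @ (A.T @ (A @ x)))     # Rayleigh quotient, approaches lambda_1
```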
Finding a small spectral approximation for a tall n×d matrix A is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of A. Row sampling improves interpretability, saves space when A is sparse, and preserves row structure, which is especially important, for example, when A represents a graph. …
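As an assumed illustration of row sampling in this spirit, and complementing the regression example above, the sketch below processes rows in a single pass and keeps each with probability proportional to an online estimate of its ridge leverage score against the rows retained so far. The ridge term lam and the oversampling constant c are arbitrary choices, not tuned parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, c = 20000, 10, 1e-3, 8.0   # ridge term and oversampling: assumed
A = rng.normal(size=(n, d))
A[:100] *= 30.0                        # plant some high-leverage rows

# Stream the rows once, keeping row i with probability proportional to its
# online ridge leverage score against the sample retained so far.
M = lam * np.eye(d)                    # running estimate of A^T A
kept_rows = []
for a in A:
    tau = a @ np.linalg.solve(M, a)    # online ridge leverage score
    p = min(1.0, c * tau)
    if rng.random() < p:
        a_w = a / np.sqrt(p)           # reweight so M stays unbiased
        kept_rows.append(a_w)
        M += np.outer(a_w, a_w)

A_tilde = np.array(kept_rows)
err = np.linalg.norm(A.T @ A - A_tilde.T @ A_tilde, 2)
print(len(kept_rows), err / np.linalg.norm(A.T @ A, 2))
```

Because each kept row is reweighted by 1/√p, the retained sample ÃᵀÃ remains an unbiased estimate of AᵀA, and the planted high-leverage rows are kept with probability near 1.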