Publications
A Practical Algorithm for Topic Modeling with Provable Guarantees
TLDR
This paper presents an algorithm for topic model inference that is both provable and practical, producing results comparable to the best MCMC implementations while running orders of magnitude faster.
Learning Topic Models -- Going beyond SVD
TLDR
This paper formally justifies Nonnegative Matrix Factorization (NMF) as a main tool in this context, which is an analog of SVD where all vectors are nonnegative, and gives the first polynomial-time algorithm for learning topic models without the above two limitations.
Computing a nonnegative matrix factorization -- provably
TLDR
This work gives an algorithm that runs in time polynomial in n, m and r under the separability condition identified by Donoho and Stodden in 2003, and is the first polynomial-time algorithm that provably works under a non-trivial condition on the input matrix.
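The separability condition makes the factorization tractable because some columns of the input matrix are, up to scaling, the extreme points that generate all the others. A minimal sketch of that idea using successive projection to greedily find these "anchor" columns (an illustrative approach under the separability assumption, not necessarily the paper's exact algorithm):

```python
import numpy as np

def successive_projection(M, r):
    """Greedy anchor selection: repeatedly pick the column with the
    largest residual norm, then project every column onto the
    orthogonal complement of the chosen one."""
    R = M.astype(float).copy()
    anchors = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))
        anchors.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)  # project out the chosen direction
    return anchors

# Separable input: the first 3 columns are the anchors, and every
# other column is a convex combination of them.
W = np.eye(3)
H = np.array([[1, 0, 0, 0.5, 0.2],
              [0, 1, 0, 0.3, 0.3],
              [0, 0, 1, 0.2, 0.5]])
M = W @ H
print(sorted(successive_projection(M, 3)))  # → [0, 1, 2]
```

Under separability the anchors pin down one factor exactly, after which the other factor can be recovered by ordinary (convex) regression.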
Robust Estimators in High Dimensions without the Computational Intractability
TLDR
This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.
Settling the Polynomial Learnability of Mixtures of Gaussians
TLDR
This paper gives the first polynomial time algorithm for proper density estimation for mixtures of k Gaussians that needs no assumptions on the mixture; the running time depends exponentially on k, and the paper proves that such a dependence is necessary.
Being Robust (in High Dimensions) Can Be Practical
TLDR
This work obtains sample complexity bounds that are optimal, up to logarithmic factors, and gives various refinements that allow the algorithms to tolerate a much larger fraction of corruptions.
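A toy illustration of the filtering idea behind such robust estimators: while the empirical covariance has a suspiciously large eigenvalue, remove the sample that sticks out most along the top eigenvector, then re-estimate. The threshold and one-at-a-time removal here are illustrative choices for a sketch, not the paper's tuned procedure:

```python
import numpy as np

def filtered_mean(X, eig_threshold=2.0, max_iter=100):
    """Crude filter: while the top eigenvalue of the empirical
    covariance exceeds the threshold, drop the point with the largest
    squared projection onto the top eigenvector, then re-estimate."""
    X = X.copy()
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        if eigvals[-1] <= eig_threshold:
            break
        scores = ((X - mu) @ eigvecs[:, -1]) ** 2
        X = np.delete(X, int(np.argmax(scores)), axis=0)
    return X.mean(axis=0)

rng = np.random.default_rng(0)
inliers = rng.standard_normal((180, 2))   # true mean is (0, 0)
outliers = np.full((20, 2), 10.0)         # 10% adversarial corruptions
data = np.vstack([inliers, outliers])
# the naive mean is pulled toward (1, 1); the filtered mean stays near 0
```

The point of the filtering viewpoint is that a large top eigenvalue certifies that corruptions are still present, so the estimator never needs to identify outliers directly.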
Simple, Efficient, and Neural Algorithms for Sparse Coding
TLDR
This work gives a general framework for understanding alternating minimization, which it leverages to analyze existing heuristics and to design new ones with provable guarantees, and gives the first efficient algorithm for sparse coding that works almost up to the information-theoretic limit for sparse recovery on incoherent dictionaries.
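A stylized fixed-point check of the alternating-minimization scheme: decode sparse codes by thresholding correlations with the current dictionary, then refit the dictionary by least squares. Using an orthonormal ground-truth dictionary (a convenient stand-in for incoherence) and exact data, one round reproduces both factors; this is an illustrative sketch, not the paper's analyzed update rule:

```python
import numpy as np

def hard_threshold(Z, tau):
    """Keep entries with magnitude >= tau, zero out the rest."""
    return np.where(np.abs(Z) >= tau, Z, 0.0)

def alt_min_round(Y, A, tau=0.5):
    """One alternating-minimization round for Y = A X with sparse X."""
    X = hard_threshold(A.T @ Y, tau)               # sparse coding step
    B, *_ = np.linalg.lstsq(X.T, Y.T, rcond=None)  # dictionary refit
    return B.T, X

# Orthonormal 4x4 dictionary (normalized Hadamard matrix, so maximally
# incoherent entries).
A_true = 0.5 * np.array([[1,  1,  1,  1],
                         [1, -1,  1, -1],
                         [1,  1, -1, -1],
                         [1, -1, -1,  1]], dtype=float)
# Sparse codes: at most two nonzeros per sample (one sample per column).
X_true = np.array([[1, 0, 0, 0,  1, 0],
                   [0, 1, 0, 0, -1, 0],
                   [0, 0, 1, 0,  0, 1],
                   [0, 0, 0, 1,  0, 1]], dtype=float)
Y = A_true @ X_true
A_hat, X_hat = alt_min_round(Y, A_true)
# at this fixed point the round returns the dictionary and codes exactly
```

The analysis question the framework addresses is exactly when such updates, started from a rough initial dictionary rather than the truth, contract toward this fixed point.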
New Algorithms for Learning Incoherent and Overcomplete Dictionaries
TLDR
This paper presents a polynomial-time algorithm for learning overcomplete dictionaries; the only previously known algorithm with provable guarantees is that of the recent work of Spielman, Wang and Wright, which handles only the full-rank case.
Efficiently learning mixtures of two Gaussians
TLDR
This work provides a polynomial-time algorithm for this problem for the case of two Gaussians in $n$ dimensions (even if they overlap), with provably minimal assumptions on the Gaussians and polynomial data requirements, and efficiently performs near-optimal clustering.
Noisy tensor completion via the sum-of-squares hierarchy
TLDR
The main technical result is in characterizing the Rademacher complexity of the sequence of norms that arise in the sum-of-squares relaxations to the tensor nuclear norm.
...