Publications
Robust Estimators in High Dimensions without the Computational Intractability
TLDR
This work obtains the first computationally efficient algorithms for agnostically learning several fundamental classes of high-dimensional distributions: a single Gaussian, a product distribution on the hypercube, mixtures of two product distributions (under a natural balancedness condition), and k Gaussians with identical spherical covariances.
Sparser Johnson-Lindenstrauss Transforms
TLDR
These are the first constructions to provide sub-constant sparsity for all values of the parameters, improving upon the previous works of Achlioptas and of Dasgupta et al.
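To make the object concrete, here is a minimal Python sketch of a sparse JL-style transform in which each column of the projection matrix has exactly s nonzero entries equal to ±1/√s, placed in uniformly random rows; the function name and the specific parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def sparse_jl_matrix(d, m, s, seed=None):
    """Build an m x d projection matrix in which every column has exactly s
    nonzero entries, each +-1/sqrt(s), placed in uniformly random rows."""
    rng = np.random.default_rng(seed)
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)   # s distinct random rows
        signs = rng.choice([-1.0, 1.0], size=s)       # independent random signs
        A[rows, j] = signs / np.sqrt(s)
    return A

# Embedding 100 points from R^1000 into R^200: with m ~ eps^-2 log n rows,
# pairwise distances are preserved up to a (1 +- eps) factor w.h.p.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1000))
A = sparse_jl_matrix(d=1000, m=200, s=8, seed=1)
Y = X @ A.T          # only s of the m rows touch each input coordinate
```

Multiplying by such a matrix costs roughly s operations per nonzero input coordinate, which is where the speedup over a dense projection comes from.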
An optimal algorithm for the distinct elements problem
TLDR
This work gives the first optimal algorithm for estimating the number of distinct elements in a data stream, closing a long line of theoretical research on this problem; the algorithm also has optimal O(1) update and reporting times.
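For context, the sketch below is a simple bottom-k ("k minimum values") distinct-elements estimator, a standard baseline rather than the optimal algorithm of the paper; the class name and constants are illustrative.

```python
import heapq
import random

class KMVSketch:
    """Bottom-k ("k minimum values") estimator for the number of distinct
    elements in a stream. A simple illustrative baseline, not the
    optimal-space, O(1)-update algorithm from the paper."""

    def __init__(self, k=256, seed=0):
        self.k = k
        self.heap = []      # max-heap via negation: tracks the k smallest hashes
        self.stored = set()
        self.salt = random.Random(seed).getrandbits(64)

    def _hash(self, item):
        # Pseudo-uniform hash of the item into [0, 1).
        return (hash((self.salt, item)) & (2**48 - 1)) / float(2**48)

    def update(self, item):
        h = self._hash(item)
        if h in self.stored:
            return
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, -h)
            self.stored.add(h)
        elif h < -self.heap[0]:             # smaller than current k-th smallest
            evicted = -heapq.heappushpop(self.heap, -h)
            self.stored.discard(evicted)
            self.stored.add(h)

    def estimate(self):
        if len(self.heap) < self.k:         # fewer than k distinct items seen
            return len(self.heap)
        return (self.k - 1) / (-self.heap[0])   # (k-1) / (k-th smallest hash)

sketch = KMVSketch(k=256)
for x in (random.randrange(50_000) for _ in range(200_000)):
    sketch.update(x)
print(round(sketch.estimate()))   # close to the true number of distinct values
```

The relative error of such a sketch decays like 1/sqrt(k), so k trades space for accuracy.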
Being Robust (in High Dimensions) Can Be Practical
TLDR
This work establishes sample complexity bounds that are optimal up to logarithmic factors, and gives various refinements that allow the algorithms to tolerate a much larger fraction of corruptions.
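As a rough illustration of the filtering approach used in this line of work (not the exact published algorithm), the sketch below repeatedly removes the points that project farthest along the top eigenvector of the empirical covariance whenever that covariance has an unusually large eigenvalue; the spectral threshold and removal fraction are assumed, untuned values.

```python
import numpy as np

def filter_mean(X, eps, spectral_threshold=2.0):
    """Sketch of spectral filtering for robust mean estimation, in the spirit
    of this line of work (not the exact published algorithm). While the
    empirical covariance has an unusually large top eigenvalue, remove the
    points projecting farthest along the corresponding eigenvector.
    The threshold and removal fraction are assumed, untuned values."""
    X = np.asarray(X, dtype=float)
    while len(X) > 2:
        mu = X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        if eigvals[-1] <= spectral_threshold:   # covariance looks well-behaved
            return mu
        v = eigvecs[:, -1]                      # direction of largest variance
        scores = np.abs((X - mu) @ v)           # outlyingness along that direction
        X = X[scores <= np.quantile(scores, 1 - 2 * eps)]  # drop extreme tail
    return X.mean(axis=0)

# Identity-covariance inliers plus a small adversarial cluster:
rng = np.random.default_rng(0)
inliers = rng.standard_normal((4500, 50))
outliers = 5.0 + 0.1 * rng.standard_normal((500, 50))
X = np.vstack([inliers, outliers])
print(np.linalg.norm(X.mean(axis=0)))          # naive mean: pulled far from 0
print(np.linalg.norm(filter_mean(X, 0.1)))     # filtered mean: close to 0
```

The underlying idea is that an eps-fraction of corruptions can shift the mean noticeably only by inflating the variance along some direction, which the spectrum of the covariance exposes.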
Sever: A Robust Meta-Algorithm for Stochastic Optimization
TLDR
This work introduces a new meta-algorithm that can take a base learner, such as least squares or stochastic gradient descent, and harden it to be resistant to outliers; in both cases the hardened learner shows substantially greater robustness than several baselines.
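A minimal sketch of the gradient-filtering idea with ordinary least squares as the base learner: each round fits the model, scores every point by the squared projection of its centered loss gradient onto the top singular direction, and drops the most extreme tail before refitting. The round count and removal fraction below are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

def sever_least_squares(X, y, eps, rounds=4):
    """Sketch of a Sever-style filtering loop around ordinary least squares.
    Each round: fit, form per-point gradients of the squared loss, and drop
    the points whose centered gradients project most strongly onto the top
    singular direction. Round count and removal fraction are illustrative."""
    idx = np.arange(len(y))
    for _ in range(rounds):
        Xs, ys = X[idx], y[idx]
        w, *_ = np.linalg.lstsq(Xs, ys, rcond=None)      # base learner
        G = 2.0 * (Xs @ w - ys)[:, None] * Xs            # per-point loss gradients
        G -= G.mean(axis=0)                              # center the gradients
        top = np.linalg.svd(G, full_matrices=False)[2][0]
        scores = (G @ top) ** 2                          # outlier score per point
        idx = idx[scores <= np.quantile(scores, 1 - eps / 2)]
    w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return w

# Linear data with an eps-fraction of label-corrupted points:
rng = np.random.default_rng(0)
n, d, eps = 2000, 10, 0.1
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)
y[: int(eps * n)] += 20.0                                # corrupted labels
print(np.linalg.norm(np.linalg.lstsq(X, y, rcond=None)[0] - w_true))  # hurt by outliers
print(np.linalg.norm(sever_least_squares(X, y, eps) - w_true))        # much closer
```

The appeal of the meta-algorithm is that the filtering step only looks at gradients, so the same outer loop can wrap many different base learners.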
A New Approach for Testing Properties of Discrete Distributions
TLDR
The sample complexity of the algorithm depends on the structure of the unknown distributions, rather than merely their domain size, and in many natural instances is significantly better than that of the worst-case optimal L1-tester.
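For background, here is a classical collision-based uniformity tester over a domain of size n; it is a worst-case-style baseline rather than the instance-adaptive tester of the paper, and its acceptance threshold is an assumed midpoint choice.

```python
import numpy as np
from collections import Counter

def looks_uniform(samples, n, eps):
    """Classical collision-based uniformity test over a domain of size n
    (a worst-case baseline, not the instance-adaptive tester of the paper).
    Uniformity gives collision probability 1/n; a distribution eps-far from
    uniform in L1 has collision probability at least (1 + eps**2)/n."""
    m = len(samples)
    pairs = m * (m - 1) / 2
    collisions = sum(c * (c - 1) // 2 for c in Counter(samples).values())
    return collisions / pairs <= (1 + eps**2 / 2) / n   # midpoint threshold (assumed)

rng = np.random.default_rng(0)
n, eps = 1000, 0.25
print(looks_uniform(rng.integers(0, n, size=200_000), n, eps))        # True
p = np.r_[np.full(n // 2, 1.5 / n), np.full(n - n // 2, 0.5 / n)]     # L1 dist 0.5
print(looks_uniform(rng.choice(n, size=200_000, p=p), n, eps))        # False
```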
Statistical Query Lower Bounds for Robust Estimation of High-Dimensional Gaussians and Gaussian Mixtures
TLDR
This work describes a general technique that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions; these bounds imply that the computational complexity of learning GMMs is inherently exponential in the dimension of the latent space, even though there is no information-theoretic barrier.
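For readers unfamiliar with the model, a Statistical Query algorithm accesses the distribution only through noisy expectation queries; the following is the standard formalization from the SQ literature, included as background rather than as a result of the paper.

```latex
% Background: the standard STAT oracle, as commonly defined in the SQ literature.
\textbf{Definition (SQ oracle).}
Let $D$ be a distribution over $\mathbb{R}^d$ and $\tau > 0$ a tolerance.
The oracle $\mathrm{STAT}(\tau)$ takes any query $\phi \colon \mathbb{R}^d \to [-1,1]$
and returns some value $v$ satisfying
\[
  \left| v - \mathbf{E}_{x \sim D}\left[ \phi(x) \right] \right| \le \tau .
\]
An SQ lower bound then shows that any algorithm restricted to such queries needs
either a very large number of queries or a very small tolerance $\tau$.
```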
On the complexity of two-player win-lose games
TLDR
It is shown that the complexity of two-player Nash equilibria is unchanged when all outcomes are restricted to be 0 or 1, meaning that win-lose games are as hard as the general case for two-player games.
The geometry of binary search trees
TLDR
It is shown that there exists an equal-cost online algorithm, transforming the conjecture of Lucas and Munro into the conjecture that the greedy algorithm is dynamically optimal; a new lower bound for searching in the BST model is also achieved.