Robust high dimensional expectation maximization algorithm via trimmed hard thresholding

@article{Wang2020RobustHD,
  title={Robust high dimensional expectation maximization algorithm via trimmed hard thresholding},
  author={Di Wang and Xiangyu Guo and Shi Li and Jinhui Xu},
  journal={Machine Learning},
  year={2020},
  volume={109},
  pages={2283--2311}
}
• Published 19 October 2020
• Computer Science, Mathematics
• Machine Learning
In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space (i.e., $d \gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called…
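The abstract is truncated, but the title suggests an iterative update that combines a trimmed-mean estimate of the gradient (to resist arbitrary corruptions) with hard thresholding (to enforce sparsity). The following is a minimal illustrative sketch of that combination on sparse linear regression rather than a latent variable model; all function names, trim levels, and step sizes here are our own choices, not the paper's algorithm:

```python
import numpy as np

def trimmed_mean(g, trim_frac=0.1):
    """Coordinate-wise trimmed mean of per-sample gradients g (n x d):
    drop the smallest and largest trim_frac fraction in each coordinate."""
    n = g.shape[0]
    k = int(np.floor(trim_frac * n))
    g_sorted = np.sort(g, axis=0)
    if k > 0:
        g_sorted = g_sorted[k:n - k]
    return g_sorted.mean(axis=0)

def hard_threshold(theta, s):
    """Keep the s largest-magnitude entries of theta, zero the rest."""
    out = np.zeros_like(theta)
    idx = np.argsort(np.abs(theta))[-s:]
    out[idx] = theta[idx]
    return out

# Sparse linear regression with a few arbitrarily corrupted responses.
rng = np.random.default_rng(0)
n, d, s = 200, 50, 5
theta_star = np.zeros(d)
theta_star[:s] = 1.0
X = rng.standard_normal((n, d))
y = X @ theta_star + 0.01 * rng.standard_normal(n)
y[:10] += 50.0  # gross corruptions

theta = np.zeros(d)
for _ in range(100):
    per_sample_grad = (X @ theta - y)[:, None] * X  # n x d gradients
    g = trimmed_mean(per_sample_grad, trim_frac=0.1)
    theta = hard_threshold(theta - 0.1 * g, s)
```

Because the corrupted samples produce extreme per-coordinate gradient values, the trimmed mean discards them, and the hard-thresholding step keeps the iterate $s$-sparse throughout.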
1 Citation
• Environmental Science, Mathematics
Earth Science Informatics
• 2022
The determination of the optimal change threshold is an essential step for very high-resolution (VHR) remote sensing change detection. Although a number of change threshold selection methods have

References

SHOWING 1-10 OF 41 REFERENCES

• Computer Science
NeurIPS
• 2018
A variant of stochastic gradient descent (SGD) that finds $\varepsilon$-approximate minimizers of convex functions in a number of iterations that is information-theoretically optimal in terms of both sample complexity and time complexity.
• Computer Science
ArXiv
• 2019
This work studies the problem of sparsity-constrained M-estimation with arbitrary corruptions to both explanatory and response variables in the high-dimensional regime, and develops a highly efficient gradient-based optimization algorithm that is a robust variant of Iterative Hard Thresholding, using a trimmed mean in the gradient computations.
• Mathematics, Computer Science
COLT
• 2019
A nearly linear time estimator which consistently estimates the true regression vector, even with a $1-o(1)$ fraction of corruptions, is provided, based on a novel variant of outlier removal via hard thresholding in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.
• Computer Science
NIPS
• 2011
This work is able to both analyze the statistical error associated with any global optimum, and prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers.
• Computer Science
COLT
• 2017
The theory identifies a unified set of deterministic conditions under which the algorithm guarantees accurate recovery of sparse functionals, and provides a novel algorithm based on the same intuition which is able to take advantage of further structure of the problem to achieve nearly optimal rates.
• Computer Science, Mathematics
NeurIPS
• 2019
It is proved that the $\ell_1$-penalized Huber M-estimator based on $n$ samples attains the optimal rate of convergence, up to a logarithmic factor, in the case where a limited number of the labels are contaminated by adversarial outliers.
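An $\ell_1$-penalized Huber M-estimator of the kind described above can be computed by proximal gradient descent: a gradient step on the Huber-smoothed loss followed by soft thresholding. The sketch below is our own illustration with arbitrary tuning constants, not the estimator or analysis from the cited paper:

```python
import numpy as np

def huber_grad(r, delta):
    """Derivative of the Huber loss: r on the quadratic part,
    +/- delta on the linear tails."""
    return np.clip(r, -delta, delta)

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (coordinate-wise soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def l1_huber(X, y, lam=0.1, delta=1.0, step=0.1, iters=500):
    """Proximal gradient descent for
    min_theta (1/n) * sum_i Huber_delta(y_i - x_i' theta) + lam * ||theta||_1."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        r = y - X @ theta
        grad = -(X.T @ huber_grad(r, delta)) / n  # gradient of the smooth part
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

# Sparse regression with a handful of grossly corrupted labels.
rng = np.random.default_rng(0)
n, d = 200, 30
theta_star = np.zeros(d)
theta_star[:5] = 1.0
X = rng.standard_normal((n, d))
y = X @ theta_star + 0.1 * rng.standard_normal(n)
y[:10] += 30.0  # adversarial label outliers
theta_hat = l1_huber(X, y)
```

The clipping in `huber_grad` caps each outlier's influence on the gradient at `delta`, which is what makes the estimator robust to label contamination.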
• Computer Science
ICML
• 2017
A generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models is proposed, with a novel semi-stochastic variance-reduced gradient designed for the Q-function in the EM algorithm.
• Computer Science
2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)
• 2017
A general technique is described that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions; it implies that the computational complexity of learning GMMs is inherently exponential in the dimension of the latent space, even though there is no such information-theoretic barrier.
The natural robust versions of two classical sparse estimation problems, namely sparse mean estimation and sparse PCA in the spiked covariance model, are studied, providing the first efficient algorithms with non-trivial error guarantees in the presence of noise.
• Computer Science
ICML
• 2013
Three algorithms popular in the uncorrupted setting, namely Thresholding Regression, the Lasso, and the Dantzig selector, are considered, and it is shown that their counterparts obtained using the trimmed inner product are provably robust.
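A trimmed inner product replaces $\langle u, v \rangle$ with a sum of elementwise products that discards the largest-magnitude terms, so a few corrupted coordinates cannot dominate the result. A minimal sketch, noting that the exact trimming rule and trim level in the cited paper may differ:

```python
import numpy as np

def trimmed_inner_product(u, v, k):
    """Sum of elementwise products u_i * v_i after discarding the k
    largest-magnitude products (an illustrative trimming rule)."""
    p = u * v
    keep = np.argsort(np.abs(p))[: len(p) - k]
    return float(p[keep].sum())

rng = np.random.default_rng(1)
n = 1000
u = rng.standard_normal(n)
v = u + 0.1 * rng.standard_normal(n)  # correlated inliers
v_corrupt = v.copy()
v_corrupt[:5] = 1e6  # a handful of gross corruptions

plain = float(u @ v_corrupt)  # can be dominated by the outliers
robust = trimmed_inner_product(u, v_corrupt, k=20)
```

The plain inner product is thrown off by the five corrupted coordinates, while the trimmed version stays close to the uncorrupted value $\langle u, v \rangle \approx n$, at the cost of a small downward bias from trimming legitimate extreme products.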