Robust high dimensional expectation maximization algorithm via trimmed hard thresholding

Di Wang, Xiangyu Guo, Shi Li, Jinhui Xu. Machine Learning, pp. 2283–2311.
In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in high dimensional space (i.e., $d \gg n$) where the underlying parameter is assumed to be sparse. Specifically, we propose a method called…
1 Citation

A comparative study of threshold selection methods for change detection from very high-resolution remote sensing images

The determination of the optimal change threshold is an essential step for very high-resolution (VHR) remote sensing change detection. Although a number of change threshold selection methods have…

Byzantine Stochastic Gradient Descent

A variant of stochastic gradient descent (SGD) that finds $\varepsilon$-approximate minimizers of convex functions in a number of iterations that is information-theoretically optimal in both sample complexity and time complexity.

High Dimensional Robust Estimation of Sparse Models via Trimmed Hard Thresholding

This work studies the problem of sparsity-constrained M-estimation with arbitrary corruptions to both explanatory and response variables in the high-dimensional regime, and develops a highly efficient gradient-based optimization algorithm: a robust variant of Iterative Hard Thresholding that uses the trimmed mean in gradient computations.
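The trimmed-mean-in-gradients idea above can be sketched compactly. The following is an illustrative toy (sparse linear regression with a few corrupted labels), not the paper's exact algorithm; all function names and parameter choices are made up for the sketch.

```python
import numpy as np

def hard_threshold(v, s):
    """Keep the s largest-magnitude entries of v; zero out the rest."""
    out = np.zeros_like(v)
    top = np.argsort(np.abs(v))[-s:]
    out[top] = v[top]
    return out

def trimmed_mean(rows, trim):
    """Coordinate-wise trimmed mean: per coordinate, drop the `trim`
    smallest and `trim` largest values, then average the rest."""
    srt = np.sort(rows, axis=0)
    return srt[trim:rows.shape[0] - trim].mean(axis=0)

def robust_iht(X, y, s, trim, lr=0.1, iters=300):
    """Robust Iterative Hard Thresholding (illustrative sketch).

    Per-sample gradients of the squared loss are aggregated with a
    coordinate-wise trimmed mean, so a few corrupted samples cannot
    drag the update; the iterate is then projected onto s-sparse vectors."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        residual = X @ beta - y            # shape (n,)
        grads = residual[:, None] * X      # per-sample gradients, shape (n, d)
        g = trimmed_mean(grads, trim)
        beta = hard_threshold(beta - lr * g, s)
    return beta
```

For the trimming to help, `trim` must exceed the number of corrupted samples that can land on either tail of each coordinate; the price is that some clean extreme values are discarded as well.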

Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

A nearly linear time estimator that consistently estimates the true regression vector even with a $1-o(1)$ fraction of corruptions, based on a novel variant of outlier removal via hard thresholding in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.

High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity

This work is able to both analyze the statistical error associated with any global optimum, and prove that a simple algorithm based on projected gradient descent will converge in polynomial time to a small neighborhood of the set of all global minimizers.

Computationally Efficient Robust Sparse Estimation in High Dimensions

The theory identifies a unified set of deterministic conditions under which the algorithm guarantees accurate recovery of sparse functionals, and provides a novel algorithm based on the same intuition which is able to take advantage of further structure of the problem to achieve nearly optimal rates.

Outlier-robust estimation of a sparse linear model using 𝓁1-penalized Huber's M-estimator

It is proved that the $\ell_1$-penalized Huber's M-estimator based on $n$ samples attains the optimal rate of convergence, up to a logarithmic factor, in the case where the labels are contaminated by a bounded number of adversarial outliers.

High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm

A generic stochastic expectation-maximization (EM) algorithm for the estimation of high-dimensional latent variable models with a novel semi-stochastic variance-reduced gradient designed for the Q-function in the EM algorithm is proposed.
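The semi-stochastic variance-reduced update mentioned above follows the SVRG pattern: a full gradient at a periodic snapshot corrects cheap single-sample gradients. Below is a generic sketch of that pattern combined with hard thresholding, illustrated on sparse least squares rather than the paper's Q-function; the names and hyperparameters are illustrative assumptions.

```python
import numpy as np

def svrg_hard_threshold(X, y, s, lr=0.05, epochs=10, rng=None):
    """SVRG-style variance-reduced gradient with hard thresholding (sketch).

    Each epoch computes a full gradient at a snapshot; inner steps use a
    single-sample gradient corrected by the snapshot's gradient at the same
    sample, which keeps the update unbiased while shrinking its variance."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(epochs):
        snap = beta.copy()
        full_g = X.T @ (X @ snap - y) / n        # full gradient at snapshot
        for _ in range(n):
            i = rng.integers(n)
            g_i = (X[i] @ beta - y[i]) * X[i]    # single-sample gradient
            g_snap = (X[i] @ snap - y[i]) * X[i]
            beta = beta - lr * (g_i - g_snap + full_g)
            # project onto s-sparse vectors after every step
            keep = np.argsort(np.abs(beta))[-s:]
            sparse = np.zeros(d)
            sparse[keep] = beta[keep]
            beta = sparse
    return beta
```

At the start of each epoch the corrected direction equals the full gradient exactly, and it stays close to it while the iterate remains near the snapshot; this is what lets the method keep a constant step size where plain SGD would need a decaying one.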

Statistical Query Lower Bounds for Robust Estimation of High-Dimensional Gaussians and Gaussian Mixtures

A general technique that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions is described. These bounds imply that the computational complexity of learning GMMs is inherently exponential in the dimension of the latent space, even though there is no such information-theoretic barrier.

Robust Sparse Estimation Tasks in High Dimensions

The natural robust versions of two classical sparse estimation problems, namely sparse mean estimation and sparse PCA in the spiked covariance model, are studied, providing for both problems the first efficient algorithms with non-trivial error guarantees in the presence of noise.

Robust Sparse Regression under Adversarial Corruption

Three algorithms popular in the uncorrupted setting (Thresholding Regression, Lasso, and the Dantzig selector) are considered, and the counterparts obtained using the trimmed inner product are shown to be provably robust.
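The trimmed inner product used above replaces a plain dot product with one that discards the most extreme element-wise products before summing. A minimal sketch (the function name and trimming rule are illustrative assumptions, not the paper's exact definition):

```python
import numpy as np

def trimmed_inner_product(u, v, h):
    """Robust surrogate for <u, v>: drop the h largest-magnitude
    element-wise products before summing, so a few wildly corrupted
    coordinates cannot dominate the result."""
    p = u * v
    keep = np.argsort(np.abs(p))[: len(p) - h]
    return p[keep].sum()
```

Note the trade-off: even on clean inputs the trimmed sum is biased, since the h largest legitimate products are discarded too, so h should be kept close to the assumed number of corruptions.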