# Subspace Clustering with Missing and Corrupted Data

@article{Charles2017SubspaceCW, title={Subspace Clustering with Missing and Corrupted Data}, author={Zachary B. Charles and Amin Jalali and Rebecca M. Willett}, journal={arXiv: Machine Learning}, year={2017} }

Given full or partial information about a collection of points that lie close to a union of several subspaces, subspace clustering refers to the process of clustering the points according to their subspace and identifying the subspaces. One popular approach, sparse subspace clustering (SSC), represents each sample as a weighted combination of the other samples, with weights of minimal $\ell_1$ norm, and then uses those learned weights to cluster the samples. SSC is stable in settings where each…
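The self-expression step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes fully observed data, replaces the exact minimal-$\ell_1$ representation with a lasso relaxation solved by proximal gradient descent (ISTA), and uses hypothetical names (`ssc_affinity`, `lam`, `n_iter`) and toy data that do not appear in the paper.

```python
import numpy as np

def soft_threshold(x, t):
    """Entrywise proximal operator of t * |x| (shrinks values toward zero)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ssc_affinity(X, lam=0.05, n_iter=500):
    """Columns of X are samples. Approximately solves, for every column x_j,
        min_c 0.5 * ||x_j - X c||_2^2 + lam * ||c||_1   subject to c_j = 0
    by proximal gradient descent (ISTA), then returns W = |C| + |C|^T."""
    n = X.shape[1]
    G = X.T @ X                        # Gram matrix of the samples
    step = 1.0 / np.linalg.norm(G, 2)  # 1 / Lipschitz constant of the gradient
    C = np.zeros((n, n))
    for _ in range(n_iter):
        grad = G @ C - G               # gradient of 0.5 * ||X - X C||_F^2
        C = soft_threshold(C - step * grad, step * lam)
        np.fill_diagonal(C, 0.0)       # a sample may not represent itself
    return np.abs(C) + np.abs(C).T

# Toy data (hypothetical): 20 unit-norm points on each of two orthogonal planes in R^5.
rng = np.random.default_rng(0)
X = np.zeros((5, 40))
X[0:2, :20] = rng.standard_normal((2, 20))   # plane spanned by e1, e2
X[2:4, 20:] = rng.standard_normal((2, 20))   # plane spanned by e3, e4
X /= np.linalg.norm(X, axis=0)

W = ssc_affinity(X)
cross = W[:20, 20:].max()   # largest affinity between points on different planes
```

The symmetric matrix `W` plays the role of the learned weights used for clustering. One common choice is to pass it to a spectral clustering routine, for example scikit-learn's `SpectralClustering(n_clusters=2, affinity="precomputed")`.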

## 8 Citations

Sparse Subspace Clustering with Missing and Corrupted Data

- Computer Science
- 2018 IEEE Data Science Workshop (DSW)
- 2018

This paper studies a robust variant of sparse subspace clustering (SSC) and gives explicit bounds on the amount of additive noise and the number of missing entries the algorithm can tolerate, both in deterministic settings and in a random generative model.

Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

- Computer Science, Mathematics
- ICML
- 2018

This paper analytically establishes that projecting the zero-filled data onto the observation pattern of the point being expressed leads to a substantial improvement in performance, and gives theoretical guarantees for SSC with incomplete data.

Evolutionary Self-Expressive Models for Subspace Clustering

- Computer Science
- IEEE Journal of Selected Topics in Signal Processing
- 2018

This work introduces evolutionary subspace clustering, a method whose objective is to cluster a collection of evolving data points that lie on a union of low-dimensional evolving subspaces, and proposes a non-convex optimization framework that exploits the self-expressiveness property of the evolving data while taking into account representation from the preceding time step.

Low-Rank Approximation of Matrices Via A Rank-Revealing Factorization with Randomization

- Computer Science, Mathematics
- ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020

This paper presents an algorithm called randomized pivoted TSOD (RP-TSOD) that constructs a highly accurate approximation to the TSOD decomposition through the exploitation of randomization and furnishes upper bounds on the error of the low-rank approximation and bounds for the canonical angles between the approximate and the exact singular subspaces.

Optimal Recovery of Missing Values for Non-Negative Matrix Factorization

- IEEE Open Journal of Signal Processing
- 2021

Missing values imputation is often evaluated on some similarity measure between actual and imputed data. However, it may be more meaningful to evaluate downstream algorithm performance after…

Optimal Recovery of Missing Values for Non-negative Matrix Factorization

- Biology, Mathematics
- 2019

Under certain geometric conditions, tight upper bounds on the NMF relative error are proved; these are the first bounds of this type for missing values in non-negative matrix factorization (NMF).

Tensor Methods for Nonlinear Matrix Completion

- Mathematics, Computer Science
- SIAM J. Math. Data Sci.
- 2021

A LADMC algorithm that leverages existing LRMC methods on a tensorized representation of the data and outperforms existing state-of-the-art methods for matrix completion under a union of subspaces model is proposed.

Efficient Low-Rank Approximation of Matrices Based on Randomized Pivoted Decomposition

- Mathematics, Computer Science
- IEEE Transactions on Signal Processing
- 2020

An algorithm called randomized pivoted TSOD (RP-TSOD) is presented, where the middle factor is lower triangular, and bounds for the canonical angles between the approximate and the exact singular subspaces are derived.

## References

SHOWING 1-10 OF 33 REFERENCES

Group-sparse subspace clustering with missing data

- Computer Science, Mathematics
- 2016 IEEE Statistical Signal Processing Workshop (SSP)
- 2016

Two novel methods for subspace clustering with missing data are described: (a) group-sparse subspace clustering (GSSC), which is based on group-sparsity and alternating minimization, and (b) mixture subspace clustering (MSC), which models each data point as a convex combination of its projections onto all subspaces in the union.

Theoretical Analysis of Sparse Subspace Clustering with Missing Entries

- Computer Science, Mathematics
- ICML
- 2018

This paper analytically establishes that projecting the zero-filled data onto the observation pattern of the point being expressed leads to a substantial improvement in performance, and gives theoretical guarantees for SSC with incomplete data.

A Geometric Analysis of Subspace Clustering with Outliers

- Mathematics, Computer Science
- ArXiv
- 2011

A novel geometric analysis of an algorithm named sparse subspace clustering (SSC) is developed, which significantly broadens the range of problems where it is provably effective and shows that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension.

Sparse Subspace Clustering with Missing Entries

- Mathematics, Computer Science
- ICML
- 2015

Two new approaches for subspace clustering and completion are proposed and evaluated, both of which outperform the natural approach when the data matrix is high-rank or the percentage of missing entries is large.

Noisy Sparse Subspace Clustering

- Mathematics, Computer Science
- J. Mach. Learn. Res.
- 2013

It is shown that a modified version of SSC is provably effective in correctly identifying the underlying subspaces, even with noisy data, which extends the theoretical guarantees of this algorithm to the practical setting and provides justification for the success of SSC in a class of real applications.

Sparse subspace clustering

- Mathematics, Computer Science
- CVPR
- 2009

This work proposes a method based on sparse representation (SR) to cluster data drawn from multiple low-dimensional linear or affine subspaces embedded in a high-dimensional space and applies this method to the problem of segmenting multiple motions in video.

Robust Subspace Clustering

- Computer Science, Mathematics
- ArXiv
- 2013

This paper introduces an algorithm inspired by sparse subspace clustering (SSC) to cluster noisy data, and develops some novel theory demonstrating its correctness.

The Information-Theoretic Requirements of Subspace Clustering with Missing Data

- Mathematics, Computer Science
- ICML
- 2016

Deterministic sampling conditions for subspace clustering with missing data (SCMD) are derived, which give precise information-theoretic requirements and determine sampling regimes, and a practical algorithm is given to certify the output of any SCMD method deterministically.

High-Rank Matrix Completion and Clustering under Self-Expressive Models

- Computer Science, Mathematics
- NIPS
- 2016

This work proposes efficient algorithms for simultaneous clustering and completion of incomplete high-dimensional data that lie in a union of low-dimensional subspaces and shows that when the data matrix is low-rank, the algorithm performs on par with or better than low-rank matrix completion methods, while for high-rank data matrices, the method significantly outperforms existing algorithms.

Generalized principal component analysis (GPCA)

- Mathematics, Medicine
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2005

An algebro-geometric solution to the problem of segmenting an unknown number of subspaces of unknown and varying dimensions from sample data points and applications of GPCA to computer vision problems such as face clustering, temporal video segmentation, and 3D motion segmentation from point correspondences in multiple affine views are presented.