On the Existence of Optimal Unions of Subspaces for Data Modeling and Clustering

Abstract

Given a set of vectors F = {f1, ..., fm} in a Hilbert space H, and given a family C of closed subspaces of H, the subspace clustering problem consists of finding a union of subspaces in C that best approximates (is nearest to) the data F. This problem has applications and connections to many areas of mathematics, computer science, and engineering, such as Generalized Principal Component Analysis (GPCA), learning theory, compressed sensing, and sampling with finite rate of innovation. In this paper, we characterize families of subspaces C for which such a best approximation exists. In finite dimensions, the characterization is in terms of the convex hull of an augmented set C+. In infinite dimensions, however, the characterization is in terms of a new but related notion of contact half-spaces. As an application, the existence of best approximations from π(G)-invariant families C of unitary representations of abelian groups is derived.
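To make "best approximates" concrete, the following is a minimal sketch of one plausible error functional in finite dimensions. It assumes, since the abstract does not state it, that the error of a union of subspaces is the sum over the data of the squared distance to the nearest subspace, and that each subspace is given by an orthonormal basis. The names `dist_sq_to_subspace` and `union_error` are hypothetical and used only for illustration.

```python
def dist_sq_to_subspace(f, basis):
    """Squared distance from vector f to span(basis).

    Assumption: `basis` is a list of orthonormal vectors, so the squared
    norm of the projection is the sum of squared inner products.
    """
    norm_sq = sum(x * x for x in f)
    proj_sq = sum(sum(x * b for x, b in zip(f, vec)) ** 2 for vec in basis)
    # Clamp at zero to guard against small negative rounding errors.
    return max(norm_sq - proj_sq, 0.0)


def union_error(data, subspaces):
    """Assumed error: each data vector is charged its squared distance
    to the closest subspace in the union, and the charges are summed."""
    return sum(min(dist_sq_to_subspace(f, V) for V in subspaces) for f in data)


# Illustrative data in R^2, approximated by the union of the two coordinate axes.
x_axis = [(1.0, 0.0)]
y_axis = [(0.0, 1.0)]
data = [(2.0, 0.1), (0.2, 3.0), (1.0, 1.0)]

err = union_error(data, [x_axis, y_axis])
print(err)  # approximately 1.05 (0.01 + 0.04 + 1.0)
```

Under this assumed functional, the existence question studied in the paper asks whether the infimum of `union_error` over all admissible unions drawn from the family C is actually attained by some union in C.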

DOI: 10.1007/s10208-011-9086-4

Cite this paper

@article{Aldroubi2011OnTE, title={On the Existence of Optimal Unions of Subspaces for Data Modeling and Clustering}, author={Akram Aldroubi and Romain Tessera}, journal={Foundations of Computational Mathematics}, year={2011}, volume={11}, pages={363-379} }