- Published 2007

Given a matrix $A \in \mathbb{R}^{m \times n}$ ($n$ vectors in $m$ dimensions), we consider the problem of selecting a submatrix (a subset of the columns) with maximum volume. The motivation for studying this problem is that if $A$ can be approximately reconstructed from a small number $k$ of its columns ($A$ has "numerical" rank $k$), then any set of $k$ independent columns of $A$ should suffice to reconstruct $A$. However, numerical stability results only if the chosen $k$ columns have large volume. We thus define an appropriate algorithmic problem, Max-Vol($k$), which asks for the $k$ columns with maximum volume. We show that Max-Vol is NP-hard, and in fact does not admit any PTAS. In particular, it is NP-hard to approximate Max-Vol within $2\sqrt{2}/3 + \epsilon$. We study a natural greedy heuristic for Max-Vol and show that it has approximation ratio $2^{-O(k \log k)}$. We show that our analysis of the greedy heuristic is tight to within a logarithmic factor in the exponent by giving an instance of Max-Vol for which the greedy heuristic is $2^{-\Omega(k)}$ from optimal. When $A$ has unit-norm columns, a related problem is to select the maximum number of vectors with a given volume (this pre-specified volume could be the volume required on grounds of numerical stability for the reconstruction). We show that if the optimal solution selects $k$ columns, then greedy will select $\Omega(k / \log k)$ columns, providing a $\log k$-approximation.
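The abstract does not spell out the greedy rule, so the following is only a minimal sketch under an assumption: at each step the heuristic picks the column with the largest norm after projecting out the span of the columns already chosen. The volume of the selected set is then the product of those residual norms (the volume of the parallelepiped spanned by the columns). The function name `greedy_max_vol` and the tolerance `tol` are hypothetical and introduced only for illustration.

```python
import math


def greedy_max_vol(columns, k, tol=1e-12):
    """Greedy sketch for Max-Vol(k).

    columns: list of n vectors (each a list of m floats), the columns of A.
    Returns (chosen_indices, volume), where volume is the product of the
    residual norms of the chosen columns.
    Assumption: the greedy rule is "largest residual norm first".
    """
    # Work on copies so the caller's data is not modified.
    residual = [[float(x) for x in col] for col in columns]
    chosen = []
    volume = 1.0

    for _ in range(k):
        # Find the unchosen column with the largest residual norm.
        best, best_norm = None, 0.0
        for j, r in enumerate(residual):
            if j in chosen:
                continue
            norm = math.sqrt(sum(x * x for x in r))
            if norm > best_norm:
                best, best_norm = j, norm

        # Stop early if the remaining columns lie in the span already chosen.
        if best is None or best_norm <= tol:
            break

        chosen.append(best)
        volume *= best_norm

        # Project every column onto the orthogonal complement of the pick.
        q = [x / best_norm for x in residual[best]]
        for j, r in enumerate(residual):
            dot = sum(a * b for a, b in zip(r, q))
            residual[j] = [a - dot * b for a, b in zip(r, q)]

    return chosen, volume


# Three columns in two dimensions: (1, 0), (0, 2), (1, 1).
chosen, volume = greedy_max_vol([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]], 2)
print(chosen, volume)
```

On this small instance the sketch first selects the column (0, 2), which has the largest norm, and then a column whose residual has norm 1, for a volume of 2. That matches the best achievable volume among all pairs here, although the abstract's lower bound shows greedy can be exponentially far from optimal in general.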

@inproceedings{ivril2007FindingMV,
title={Finding Maximum Volume Sub-matrices of a Matrix},
author={Ali Çivril and Malik Magdon-Ismail},
year={2007}
}