• Corpus ID: 7881866

Optimal Estimation and Completion of Matrices with Biclustering Structures

@article{Gao2016OptimalEA,
  title={Optimal Estimation and Completion of Matrices with Biclustering Structures},
  author={Chao Gao and Yu Lu and Zongming Ma and Harrison H. Zhou},
  journal={J. Mach. Learn. Res.},
  year={2016},
  volume={17},
  pages={161:1-161:29}
}
Biclustering structures in data matrices were first formalized in a seminal paper by John Hartigan (1972) where one seeks to cluster cases and variables simultaneously. Such structures are also prevalent in block modeling of networks. In this paper, we develop a unified theory for the estimation and completion of matrices with biclustering structures, where the data is a partially observed and noise contaminated data matrix with a certain biclustering structure. In particular, we show that a… 

On Variational Inference in Biclustering Models
TLDR
A precise estimation bound for the variational estimator is obtained, and it is shown to match the minimax rate in terms of estimation error under mild assumptions in the biclustering setting.
Profile likelihood biclustering
TLDR
A new heuristic optimization procedure based on the Kernighan-Lin heuristic, which has nice computational properties and performs well in simulations, is proposed, and it is proved that the procedure recovers the true row and column classes when the dimensions of the data matrix tend to infinity.
Optimal Bipartite Network Clustering
TLDR
A fast two-stage procedure is presented, based on spectral initialization followed by two applications of a pseudo-likelihood classifier, that achieves, adaptively over the whole class and up to constants, the optimal convergence rate achievable by a biclustering oracle.
Local Inference by Penalization Method for Biclustering Model
TLDR
There is (almost) no deceptiveness issue for the uncertainty quantification problem in the biclustering model; a confidence set is constructed and its local (oracle) optimality is established.
Bayesian Model Selection with Graph Structured Sparsity
TLDR
An EM-type algorithm with closed-form iterations is derived to efficiently explore possible candidates for Bayesian model selection using the notion of effective resistance, and the deterministic nature of the proposed algorithm makes it more scalable to large-scale and high-dimensional data sets than existing stochastic search algorithms.
Subspace Estimation from Unbalanced and Incomplete Data Matrices: $\ell_{2,\infty}$ Statistical Guarantees
TLDR
This paper investigates an efficient spectral method, which operates upon the sample Gram matrix with diagonal deletion, and establishes new statistical guarantees for this method in terms of both $\ell_2$ and $\ell_{2,\infty}$ estimation accuracy, which improve upon prior results if $d_2$ is substantially larger than $d_1$.
Lattice partition recovery with dyadic CART
TLDR
It is proved that, under appropriate regularity conditions on the shape of the partition elements, a DCART-based procedure consistently estimates the underlying partition at a rate of order $\sigma^2 k^* \log(N)/\kappa$, where $k^*$ is the minimal number of rectangular sub-graphs obtained using recursive dyadic partitions supporting the signal partition and $\sigma^2$ is the noise variance.
Nonparametric Matrix Estimation with One-Sided Covariates
  • C. Yu
  • Computer Science, ArXiv
  • 2021
TLDR
An algorithm and accompanying analysis are provided, showing that the algorithm improves upon naively estimating each row separately when the number of rows is not too small and achieves the minimax optimal nonparametric rate of an oracle algorithm that knows the row covariates.

References

SHOWING 1-10 OF 45 REFERENCES
Co-clustering of Nonsmooth Graphons
TLDR
It is shown that co-clusters found by any method can be extended to the row and column populations, or equivalently that the estimated blockmodel approximates a blocked version of the generative graphon, with estimation error bounded by $O_P(n^{-1/2})$.
Consistent nonparametric estimation for heavy-tailed sparse graphs
TLDR
This work studies graphons as a non-parametric generalization of stochastic block models, and shows how to obtain compactly represented estimators for sparse networks in this framework and characterize identifiability for these graphons and their underlying position spaces.
Oracle inequalities for network models and sparse graphon estimation
TLDR
This work constructs estimators of network connection probabilities -- the ordinary block constant least squares estimator, and its restricted version, which satisfy oracle inequalities with respect to the block constant oracle and derive optimal rates of estimation of the probability matrix.
Co-clustering separately exchangeable network data
TLDR
Stochastic blockmodels are established for addressing the co-clustering problem of partitioning a binary array into subsets, and it is shown for large sample sizes that the detection of co-clusters in such data indicates with high probability the existence of co-clusters of equal size and asymptotically equivalent connectivity in the underlying generative process.
Biclustering via Sparse Singular Value Decomposition
Sparse singular value decomposition (SSVD) is proposed as a new exploratory analysis tool for biclustering or identifying interpretable row–column associations within high‐dimensional data.
Nuclear norm penalization and optimal rates for noisy low rank matrix completion
TLDR
A new nuclear norm penalized estimator of $A_0$ is proposed and a general sharp oracle inequality for this estimator is established for arbitrary values of $n,m_1,m_2$ under the condition of isometry in expectation to find the best trace regression model approximating the data.
Rate-optimal graphon estimation
TLDR
This paper establishes the optimal rate of convergence for graphon estimation in a Hölder class with smoothness $\alpha$, which, surprisingly, is identical to the classical nonparametric rate.
Exact Matrix Completion via Convex Optimization
TLDR
It is proved that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries, and that objects other than signals and images can be perfectly reconstructed from very limited information.
Stochastic blockmodel approximation of a graphon: Theory and consistent estimation
TLDR
This paper proposes a computationally efficient procedure to estimate a graphon from a set of observed networks generated from it based on a stochastic blockmodel approximation (SBA) of the graphon, and shows that the estimation error vanishes as the size of the graph approaches infinity.
The Power of Convex Relaxation: Near-Optimal Matrix Completion
TLDR
This paper shows that, under certain incoherence assumptions on the singular vectors of the matrix, recovery is possible by solving a convenient convex program as soon as the number of entries is on the order of the information theoretic limit (up to logarithmic factors).