Corpus ID: 202712534

Non-Parametric Structure Learning on Hidden Tree-Shaped Distributions

Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Anand D. Sarwate
We provide high-probability sample complexity guarantees for non-parametric structure learning of tree-shaped graphical models whose nodes are discrete random variables with a finite or countable alphabet, in both the noiseless and noisy regimes. First, we introduce a new, fundamental quantity called the (noisy) information threshold, which arises naturally from the error analysis of the Chow-Liu algorithm and characterizes not only the sample complexity, but also the inherent impact of the…
SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples
This paper presents a novel impossibility result by deriving a bound on the necessary number of samples for partial recovery of Katiyar et al. (2020), and proposes Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
Exact Asymptotics for Learning Tree-Structured Graphical Models With Side Information: Noiseless and Noisy Samples
The theoretical results agree closely with experimental results for sample sizes as small as those in the hundreds; they refine the large-deviation results of Tan et al. (2011) and strictly improve on those of Bresler and Karzand (2020).
Robust Estimation of Tree Structured Ising Models
This paper focuses on the robust estimation of tree-structured Ising models and proves that the problem is unidentifiable; however, the unidentifiability is limited to a small equivalence class of trees formed by leaf nodes exchanging positions with their neighbors.


Learning Tree Structures from Noisy Data
The impact of measurement noise on the task of learning the underlying tree structure via the well-known Chow-Liu algorithm is studied, and formal sample complexity guarantees for exact recovery are provided.
Predictive Learning on Sign-Valued Hidden Markov Trees
This paper quantifies how noise in the hidden model impacts the sample complexity of structure learning and predictive distributional inference by proving upper and lower bounds on the sample complexity.
Predictive Learning on Hidden Tree-Structured Ising Models
This paper quantifies how noise in the hidden model impacts the sample complexity of structure learning and of estimating marginal distributions by proving upper and lower bounds on the sample complexity.
Learning High-Dimensional Markov Forest Distributions: Analysis of Error Rates
It is shown that this algorithm is both structurally consistent and risk consistent, and that the error probability of structure learning decays faster than any polynomial in the number of samples under fixed model size. It is also proved that the independent tree model is the hardest to learn using the proposed algorithm in terms of error rates for structure learning.
Learning a Tree-Structured Ising Model in Order to Make Predictions
One of the main messages of this paper is that far fewer samples are needed for accurate prediction than for recovering the underlying tree, which means that accurate predictions are possible even using the wrong tree.
Learning Gaussian Tree Models: Analysis of Error Exponents and Extremal Structures
It is shown that the extremal tree structure that minimizes the error exponent is the star for any fixed set of correlation coefficients on the edges of the tree, and that the star and the Markov chain represent the hardest and the easiest structures to learn, respectively, in the class of tree-structured Gaussian graphical models.
Hardness of parameter estimation in graphical models
The main result shows that parameter estimation is in general intractable: no algorithm can learn the canonical parameters of a generic pairwise binary graphical model from the mean parameters in time bounded by a polynomial in the number of variables.
Learning of Tree-Structured Gaussian Graphical Models on Distributed Data Under Communication Constraints
This paper presents a set of communication-efficient strategies, which are theoretically proved to convey sufficient information for reliable learning of the structure of tree-structured Gaussian graphical models from distributed data.
Information-Theoretic Limits of Selecting Binary Graphical Models in High Dimensions
This paper analyzes the information-theoretic limitations of graph selection for binary Markov random fields under high-dimensional scaling, in which the graph size and the number of edges k, and/or the maximal node degree d, are allowed to increase to infinity as a function of the sample size n.
A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures
This work analyzes the scenario of ML-estimation in the very noisy learning regime and shows that the error exponent can be approximated as a ratio, which is interpreted as the signal-to-noise ratio (SNR) for learning tree distributions.