Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models

@article{BoixAdser2022ChowLiuOP,
  title={Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models},
  author={Enric Boix-Adser{\`a} and Guy Bresler and Frederic Koehler},
  journal={2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS)},
  year={2022},
  pages={417-426}
}
We consider the problem of learning a tree-structured Ising model from data, such that subsequent predictions computed using the model are accurate. Concretely, we aim to learn a model such that posteriors $p(X_i \mid X_S)$ for small sets of variables $S$ are accurate. Since its introduction more than 50 years ago, the Chow-Liu algorithm, which efficiently computes the maximum likelihood tree, has been the benchmark algorithm for learning tree-structured graphical models. A bound on the sample…
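For context, the following is a minimal Python sketch of the classical Chow-Liu baseline referenced above, not of the Chow-Liu++ algorithm introduced in this paper: estimate all pairwise mutual informations from ±1-valued samples, then return a maximum-weight spanning tree. The function names and the use of SciPy's minimum_spanning_tree on negated weights are illustrative choices.

# Minimal sketch of the classical Chow-Liu algorithm for +/-1 (Ising) samples;
# illustrative only, not the Chow-Liu++ algorithm of this paper.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def empirical_mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in nats for two +/-1-valued sample vectors."""
    mi = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            p_ab = np.mean((x == a) & (y == b))
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:  # skip empty cells; avoids log(0) and 0/0
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(samples):
    """samples: (n_samples, p) array with +/-1 entries.
    Returns the edges of a maximum-weight spanning tree under empirical
    pairwise mutual information (the classical Chow-Liu tree)."""
    p = samples.shape[1]
    weights = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            weights[i, j] = empirical_mutual_information(samples[:, i], samples[:, j])
    # Max-weight spanning tree = min spanning tree on negated weights;
    # assumes all empirical MIs are strictly positive, so every pair is an edge.
    mst = minimum_spanning_tree(-weights)
    return list(zip(*mst.nonzero()))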


Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees
TLDR
Active-LATHE (Active Learning Algorithm for Trees with Homogeneous Edges) surprisingly boosts the error exponent by at least 40% when $\rho$ is at least 0.8; it hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph in which the algorithm is less confident of being correct.
Prediction-Centric Learning of Independent Cascade Dynamics from Partial Observations
TLDR
This work introduces a computationally efficient algorithm, based on a scalable dynamic message-passing approach, which is able to learn parameters of the effective spreading model given only limited information on the activation times of nodes in the network.
Learning Linear Non-Gaussian Polytree Models
TLDR
This approach combines the Chow–Liu algorithm, which first learns the undirected tree structure, with novel schemes to orient the edges by imposing an algebraic condition on moments of the data-generating distribution, gaining significant efficiency over current algorithms that assess conditional independence.

References

SHOWING 1-10 OF 51 REFERENCES
Learning a Tree-Structured Ising Model in Order to Make Predictions
TLDR
One of the main messages of this paper is that far fewer samples are needed for accurate prediction than for recovering the underlying tree, which means that accurate predictions are possible even using the wrong tree.
Tree-structured Ising models can be learned efficiently
TLDR
It is shown that $n$-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance $\epsilon$ from an optimal $O(n \log n/\epsilon^2)$ samples, where $O(\cdot)$ hides an absolute constant which does not depend on the model being learned.
Near-optimal learning of tree-structured distributions by Chow-Liu
TLDR
The upper bound is based on a new conditional independence tester that addresses an open problem posed by Canonne, Diakonikolas, Kane, and Stewart (STOC, 2018): it is proved that for three random variables $X, Y, Z$, each over a domain $\Sigma$, testing whether $I(X; Y \mid Z)$ is $0$ or $\geq \epsilon$ is possible with $O(|\Sigma|^3/\epsilon)$ samples.
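To make the tested quantity concrete, here is a hedged Python sketch of a plug-in estimator of the conditional mutual information $I(X; Y \mid Z)$ over finite alphabets; the actual tester analyzed in the cited paper is more refined, and the acceptance threshold of $\epsilon/2$ below is an illustrative choice only.

# Illustrative plug-in estimate of conditional mutual information I(X; Y | Z)
# over finite alphabets; shows only the quantity being tested, not the
# refined tester from the cited paper.
from collections import Counter
from math import log

def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in nats from three equal-length
    sequences of hashable symbols."""
    n = len(xs)
    p_xyz = Counter(zip(xs, ys, zs))
    p_xz = Counter(zip(xs, zs))
    p_yz = Counter(zip(ys, zs))
    p_z = Counter(zs)
    cmi = 0.0
    for (x, y, z), c in p_xyz.items():
        # I(X;Y|Z) = sum_{x,y,z} p(x,y,z) log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]
        cmi += (c / n) * log((c / n) * (p_z[z] / n) /
                             ((p_xz[(x, z)] / n) * (p_yz[(y, z)] / n)))
    return cmi

def looks_conditionally_independent(xs, ys, zs, eps):
    """Toy tester: declare I(X;Y|Z) close to 0 if the plug-in estimate falls
    below eps / 2 (threshold choice is illustrative, not from the paper)."""
    return conditional_mutual_information(xs, ys, zs) < eps / 2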
Predictive Learning on Hidden Tree-Structured Ising Models
TLDR
This paper quantifies how noise in the hidden model impacts the sample complexity of structure learning and the estimation of marginal distributions by proving upper and lower bounds on the sample complexity.
High-dimensional Ising model selection using ℓ1-regularized logistic regression
TLDR
It is proved that consistent neighborhood selection can be obtained for sample sizes $n=\Omega(d^3\log p)$ with exponentially decaying error, and when these same conditions are imposed directly on the sample matrices, it is shown that a reduced sample size suffices for the method to estimate neighborhoods consistently.
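As a hedged illustration of the neighborhood-selection idea summarized above (regress each spin on all the others with an $\ell_1$ penalty and read the neighborhood off the support of the fitted coefficients), here is a minimal Python sketch using scikit-learn; the penalty strength C and the zero threshold tol are illustrative assumptions, not the theoretically prescribed choices from the paper.

# Sketch of l1-regularized logistic-regression neighborhood selection for an
# Ising model: regress each spin on all others and take the support.
# The penalty strength C and the zero threshold tol are illustrative choices.
import numpy as np
from sklearn.linear_model import LogisticRegression

def estimate_neighborhoods(samples, C=0.1, tol=1e-6):
    """samples: (n, p) array of +/-1 spins.
    Returns a dict mapping each node to its estimated neighbor set."""
    n, p = samples.shape
    neighbors = {}
    for i in range(p):
        X = np.delete(samples, i, axis=1)   # all other spins as features
        y = samples[:, i]                   # target spin
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X, y)
        others = [j for j in range(p) if j != i]
        support = [others[k] for k, w in enumerate(clf.coef_[0]) if abs(w) > tol]
        neighbors[i] = set(support)
    return neighbors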
Learning Tree Structures from Noisy Data
TLDR
The impact of measurement noise on the task of learning the underlying tree structure via the well-known Chow-Liu algorithm is studied and formal sample complexity guarantees for exact recovery are provided.
Efficiently Learning Ising Models on Arbitrary Graphs
TLDR
A simple greedy procedure makes it possible to learn the structure of an Ising model on an arbitrary bounded-degree graph in time on the order of $p^2$, and it is shown that for any node there exists at least one neighbor with which it has high mutual information.
Robust Estimation of Tree Structured Ising Models
TLDR
This paper focuses on the problem of robust estimation of tree-structured Ising models and proves that the problem is unidentifiable; however, the unidentifiability is limited to a small equivalence class of trees formed by leaf nodes exchanging positions with their neighbors.
A Large-Deviation Analysis of the Maximum-Likelihood Learning of Markov Tree Structures
TLDR
This work analyzes maximum-likelihood estimation in the very noisy learning regime and shows that the error exponent can be approximated as a ratio, which is interpreted as the signal-to-noise ratio (SNR) for learning tree distributions.
Learning Graphical Models Using Multiplicative Weights
TLDR
An algorithm is given for learning the structure of $t$-wise MRFs with nearly optimal sample complexity and running time $n^t$; it provides the first solution to the problem of learning sparse Generalized Linear Models (GLMs).