Corpus ID: 219965583

Learning of Discrete Graphical Models with Neural Networks

@article{Jayakumar2020LearningOD,
  title={Learning of Discrete Graphical Models with Neural Networks},
  author={Abhijith Jayakumar and Andrey Y. Lokhov and Sidhant Misra and Marc Vuffray},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.11937}
}
Graphical models are widely used in science to represent joint probability distributions with an underlying conditional dependence structure. The inverse problem of learning a discrete graphical model given i.i.d. samples from its joint distribution can be solved with near-optimal sample complexity using a convex optimization method known as the Generalized Regularized Interaction Screening Estimator (GRISE). But the computational cost of GRISE becomes prohibitive when the energy function of the…
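To make the screening idea concrete: in the pairwise (Ising) special case, the estimator for each spin minimizes the empirical average of the exponentiated negative local energy, which is convex in the couplings. The sketch below is a minimal NumPy illustration of that special case only, not the generalized estimator the abstract refers to; the function names, step size, and plain gradient-descent loop are our own choices.

```python
import numpy as np

def iso(theta, samples, i):
    """Empirical Interaction Screening Objective for spin i (Ising case).

    theta   : (p,) candidate couplings; theta[j] plays the role of theta_ij.
    samples : (n, p) array of +/-1 spins drawn i.i.d. from the model.
    """
    th = theta.copy()
    th[i] = 0.0                                      # no self-coupling
    return np.mean(np.exp(-samples[:, i] * (samples @ th)))

def fit_spin(samples, i, lr=0.5, steps=2000):
    """Minimize the convex ISO for spin i by plain gradient descent."""
    n, p = samples.shape
    theta = np.zeros(p)
    for _ in range(steps):
        th = theta.copy()
        th[i] = 0.0
        w = np.exp(-samples[:, i] * (samples @ th))  # per-sample factors
        grad = -((w * samples[:, i]) @ samples) / n  # d(ISO)/d(theta_j)
        grad[i] = 0.0
        theta -= lr * grad
    return theta

# Sanity check: independent spins should yield near-zero couplings.
rng = np.random.default_rng(0)
spins = rng.choice([-1.0, 1.0], size=(5000, 5))
print(np.round(fit_spin(spins, i=0), 3))
```

GRISE generalizes this objective to arbitrary discrete alphabets and multi-body terms expanded in a chosen basis, and adds an ℓ1 penalty on the coefficients.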

Citations

Learning Continuous Exponential Families Beyond Gaussian

This work introduces a computationally efficient method for learning continuous graphical models based on the Interaction Screening approach. The method matches alternatives such as conditional likelihood maximization in accuracy and sample complexity while considerably improving the algorithm's run-time.

Reconstruction of pairwise interactions using energy-based models

This work proposes an approach based on energy-based models and pseudolikelihood maximization, and shows that hybrid models, which combine a pairwise model with a neural network, can lead to significant improvements in the reconstruction of pairwise interactions.
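The pseudolikelihood part of this approach is easy to state for a purely pairwise model: each Ising spin is conditionally a logistic function of its local field, so the negative log-pseudolikelihood is a sum of per-spin logistic losses. A minimal sketch under that pairwise assumption (the neural-network energy term of the hybrid models is deliberately omitted, and the optimizer choice is ours):

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_plm(Theta, samples):
    """Negative log-pseudolikelihood of a pairwise Ising model.

    For +/-1 spins, P(sigma_i | rest) = sigmoid(2 * sigma_i * h_i), where
    h_i = sum_j Theta[i, j] * sigma_j is the local field at site i.
    """
    h = samples @ Theta.T                  # (n, p) local fields
    x = 2.0 * samples * h                  # 2 * sigma_i * h_i
    return np.logaddexp(0.0, -x).mean()    # mean of -log sigmoid(x)

def fit_plm(samples):
    """Fit couplings with L-BFGS (finite-difference gradients, for brevity)."""
    n, p = samples.shape

    def obj(flat):
        Theta = flat.reshape(p, p).copy()
        np.fill_diagonal(Theta, 0.0)       # forbid self-couplings
        return neg_log_plm(Theta, samples)

    res = minimize(obj, np.zeros(p * p), method="L-BFGS-B")
    Theta = res.x.reshape(p, p)
    np.fill_diagonal(Theta, 0.0)
    return 0.5 * (Theta + Theta.T)         # symmetrize the estimate
```

The hybrid models above add a neural-network correction to this pairwise energy and maximize the same style of per-site conditional objective.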

Efficient learning of discrete graphical models

This work provides the first sample-efficient method based on the interaction screening framework that allows one to provably learn fully general discrete factor models with node-specific discrete alphabets and multi-body interactions, specified in an arbitrary basis.

References

Showing 1-10 of 33 references

Efficient learning of discrete graphical models

This work provides the first sample-efficient method based on the interaction screening framework that allows one to provably learn fully general discrete factor models with node-specific discrete alphabets and multi-body interactions, specified in an arbitrary basis.
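For orientation, the generalized screening objective in this line of work takes the following shape (our paraphrase, reconstructed from the Ising special case sketched above): for each variable $u$, with basis functions $f_K$ indexed by factors $K$ containing $u$,

$$\hat{\theta}_u \in \arg\min_{\theta_u} \; \frac{1}{n} \sum_{t=1}^{n} \exp\Big( -\sum_{K \ni u} \theta_K \, f_K\big(\sigma^{(t)}\big) \Big) + \lambda \, \lVert \theta_u \rVert_1,$$

i.e., the empirical average of the exponentiated negative local energy, plus an ℓ1 penalty that enforces sparsity.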

Information-Theoretic Limits of Selecting Binary Graphical Models in High Dimensions

This work analyzes the information-theoretic limits of graph selection for binary Markov random fields under high-dimensional scaling, in which the graph size p, the number of edges k, and/or the maximal node degree d are allowed to grow to infinity with the sample size n.

Optimal structure and parameter learning of Ising models

This study shows that the interaction screening method is an exact, tractable, and optimal technique that universally solves the inverse Ising problem.

Efficiently Learning Ising Models on Arbitrary Graphs

A simple greedy procedure learns the structure of an Ising model on an arbitrary bounded-degree graph in time on the order of $p^2$; the key structural fact is that every node has at least one neighbor with which it shares high mutual information.
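To illustrate the mutual-information screening step only (the full greedy algorithm also grows and prunes candidate neighborhoods, and the right threshold depends on model parameters), here is a minimal sketch with names of our own choosing:

```python
import numpy as np

def pairwise_mi(samples):
    """Empirical mutual information (in nats) between all pairs of +/-1 spins."""
    n, p = samples.shape
    b = (samples > 0).astype(int)          # map {-1, +1} -> {0, 1}
    mi = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            joint = np.array([[np.mean((b[:, i] == a) & (b[:, j] == c))
                               for c in range(2)] for a in range(2)])
            pi, pj = joint.sum(axis=1), joint.sum(axis=0)
            mask = joint > 0
            mi[i, j] = mi[j, i] = np.sum(
                joint[mask] * np.log(joint[mask] / np.outer(pi, pj)[mask]))
    return mi

def candidate_neighbors(mi, i, tau):
    """Nodes whose empirical MI with node i exceeds the threshold tau."""
    return [j for j in range(mi.shape[0]) if j != i and mi[i, j] > tau]
```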

Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models

We consider the problem of learning the underlying graph of an unknown Ising model on p spins from a collection of i.i.d. samples generated from the model. We suggest a new estimator, based on minimizing an interaction screening objective, that is both computationally efficient and sample-optimal.

High-dimensional Ising model selection using ℓ1-regularized logistic regression

It is proved that consistent neighborhood selection can be obtained for sample sizes $n=\Omega(d^3\log p)$ with exponentially decaying error, and that when the required incoherence conditions are imposed directly on the sample matrices, a reduced sample size suffices for the method to estimate neighborhoods consistently.
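In practice the method is one penalized regression per node: regress each spin on all the others with an ℓ1-penalized logistic loss and read the estimated neighborhood off the support of the coefficients. A minimal scikit-learn sketch; the inverse regularization strength C is a placeholder one would tune (the theory prescribes a regularization level scaling like $\sqrt{\log p / n}$):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def neighborhood(samples, i, C=0.1):
    """Estimate the neighborhood of node i in a binary Markov random field."""
    y = (samples[:, i] > 0).astype(int)        # spin i as the 0/1 label
    X = np.delete(samples, i, axis=1)          # all other spins as features
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=C).fit(X, y)
    support = np.flatnonzero(np.abs(clf.coef_[0]) > 1e-6)
    others = [j for j in range(samples.shape[1]) if j != i]
    return [others[k] for k in support]

# On independent spins, strong regularization should return an empty set.
rng = np.random.default_rng(0)
spins = rng.choice([-1.0, 1.0], size=(2000, 6))
print(neighborhood(spins, i=0, C=0.05))
```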

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
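The update rule is short enough to reproduce in full. The defaults below (step size $10^{-3}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$) are the ones suggested in the paper; the toy usage at the bottom is ours:

```python
import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimal Adam: adaptive moment estimation for first-order minimization."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)                 # first-moment (mean) estimate
    v = np.zeros_like(x)                 # second-moment (uncentered) estimate
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)     # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)     # bias-corrected second moment
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Minimize (x - 3)^2; the gradient is 2 * (x - 3), the minimizer is x = 3.
print(adam(lambda x: 2 * (x - 3.0), x0=[0.0], lr=0.1, steps=2000))
```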

Approximating discrete probability distributions with dependence trees

It is shown that the procedure derived in this paper yields the approximation with minimum difference in information, and that, when applied to empirical observations from an unknown distribution of tree dependence, it is the maximum-likelihood estimate of that distribution.
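The procedure (now usually called the Chow-Liu algorithm) reduces to a maximum-weight spanning tree over pairwise mutual informations. A minimal sketch, assuming a precomputed MI matrix such as the one produced by the screening sketch above:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_edges(mi):
    """Edges of the Chow-Liu tree: max-weight spanning tree of the MI matrix."""
    # Negate weights so a minimum-spanning-tree routine finds the maximum-MI
    # tree; a tiny shift keeps entries strictly negative, since SciPy treats
    # exact zeros in a dense graph as missing edges.
    w = -(mi + 1e-9)
    np.fill_diagonal(w, 0.0)
    tree = minimum_spanning_tree(w).tocoo()
    return list(zip(tree.row.tolist(), tree.col.tolist()))

# A 5-node toy matrix yields the 4 edges of a spanning tree.
rng = np.random.default_rng(0)
m = rng.random((5, 5)); m = 0.5 * (m + m.T); np.fill_diagonal(m, 0.0)
print(chow_liu_edges(m))
```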

Universal approximation bounds for superpositions of a sigmoidal function

  • A. Barron, IEEE Trans. Inf. Theory, 1993
The approximation rate and the parsimony of the network parameterization are shown to be advantageous in high-dimensional settings, whereas with any fixed basis the integrated squared approximation error cannot be made smaller than order $1/n^{2/d}$ uniformly over functions satisfying the same smoothness assumption.
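For reference, the positive half of the result is the dimension-independent rate usually quoted as Barron's theorem (notation ours, stated from memory): if $C_f = \int_{\mathbb{R}^d} \lVert\omega\rVert \, |\hat{f}(\omega)| \, d\omega < \infty$, then for any probability measure $\mu$ on the ball $B_r$ of radius $r$ there is a superposition $f_n$ of $n$ sigmoidal units with

$$\int_{B_r} \big(f(x) - f_n(x)\big)^2 \, \mu(dx) \;\le\; \frac{(2 r C_f)^2}{n},$$

while the $1/n^{2/d}$ rate above is the matching lower bound for approximation from any fixed $n$-dimensional linear space of basis functions.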