• Corpus ID: 235731566

Learning Bayesian Networks through Birkhoff Polytope: A Relaxation Method

Aramayis Dallakyan, Mohsen Pourahmadi
We establish a novel framework for learning a directed acyclic graph (DAG) when data are generated from a Gaussian, linear structural equation model. The framework has two parts: (1) introducing a permutation matrix as a new parameter within a regularized Gaussian log-likelihood to represent the variable ordering; and (2) given the ordering, estimating the DAG structure through a sparse Cholesky factor of the inverse covariance matrix. For permutation matrix estimation, we propose a relaxation technique…
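
As a rough illustration of part (2), the sketch below shows how, once the variables are in a topological order, the edge weights of a linear SEM can be read off a triangular factor of the inverse covariance matrix. It assumes the convention Ω = LᵀL with L lower triangular, works on the exact population precision matrix, and leaves out both the sparsity penalty and the permutation estimation. The function name `weights_from_precision` and the toy numbers are hypothetical, and this is not the authors' estimator.

```python
import numpy as np


def weights_from_precision(omega):
    """Recover SEM weights B (strictly lower triangular) from a precision matrix.

    Assumes the variables are already in topological order, so that
    X = B X + eps with independent noise, and omega = L.T @ L where
    L = D^{-1/2} (I - B) is lower triangular with positive diagonal.
    """
    # np.linalg.cholesky returns C with A = C @ C.T (C lower triangular).
    # Reversing the row and column order turns omega = L.T @ L into that form.
    c = np.linalg.cholesky(omega[::-1, ::-1])
    # Undo the reversal and transpose to obtain the lower-triangular L.
    l = c[::-1, ::-1].T
    # Dividing each row by its diagonal entry gives I - B.
    return np.eye(omega.shape[0]) - l / np.diag(l)[:, None]


# Toy 3-node chain 0 -> 1 -> 2 (hypothetical weights and noise variances).
b_true = np.array([[0.0, 0.0, 0.0],
                   [0.8, 0.0, 0.0],
                   [0.0, -0.5, 0.0]])
noise_var = np.array([1.0, 0.5, 2.0])
i_minus_b = np.eye(3) - b_true
omega = i_minus_b.T @ np.diag(1.0 / noise_var) @ i_minus_b

b_hat = weights_from_precision(omega)
print(np.round(b_hat, 3))
```

The nonzero entries of the recovered matrix mark the parents of each node, which is why a sparse factor corresponds to a sparse DAG under this convention.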

Optimizing Regularized Cholesky Score for Order-Based Learning of Bayesian Networks
This work proposes a novel structure learning method, annealing on regularized Cholesky score (ARCS), to search over topological sorts, or permutations of nodes, for a high-scoring Bayesian network, and establishes the consistency of the scoring function in estimating topological sorts and DAG structures in the large-sample limit.
High-dimensional learning of linear causal networks via inverse covariance estimation
It is shown that when the error variances are known or estimated to close enough precision, the true DAG is the unique minimizer of the score computed using the reweighted squared l2-loss.
Learning Bayesian Network Structure using LP Relaxations
This work proposes to solve the combinatorial problem of finding the highest-scoring Bayesian network structure from data by maintaining an outer bound approximation to the polytope and iteratively tightening it by searching over a new class of valid constraints.
Learning Bayesian network structure: Towards the essential graph by integer linear programming tools
Extensions of characteristic imsets are considered which additionally encode chain graphs without flags equivalent to acyclic directed graphs, and a polyhedral description of the respective domain of the ILP problem, that is, a description by means of a set of linear inequalities, is given.
Concave penalized estimation of sparse Gaussian Bayesian networks
This work develops a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data using concave regularization and provides theoretical guarantees which generalize existing asymptotic results when the underlying distribution is Gaussian.
Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs.
This paper proposes an efficient penalized likelihood method for estimation of the adjacency matrix of directed acyclic graphs, and shows that although the lasso is only variable selection consistent under stringent conditions, the adaptive lasso can consistently estimate the true graph under the usual regularity assumptions.
Learning Sparse Nonparametric DAGs
A completely general framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data is developed that can be applied to general nonlinear models, general differentiable loss functions, and generic black-box optimization routines.
Learning Local Dependence In Ordered Data
  • Guo Yu, J. Bien
  • Mathematics, Computer Science
    J. Mach. Learn. Res.
  • 2017
This work proposes a framework for learning local dependence based on estimating the inverse of the Cholesky factor of the covariance matrix, which yields a simple regression interpretation for local dependence in which variables are predicted by their neighbors.
DAGs with NO TEARS: Continuous Optimization for Structure Learning
This paper formulates the structure learning problem as a purely continuous optimization problem over real matrices that avoids the combinatorial acyclicity constraint entirely, using a novel characterization of acyclicity that is not only smooth but also exact.
A Simple Approach for Finding the Globally Optimal Bayesian Network Structure
It is shown that it is possible to learn the best Bayesian network structure with over 30 variables, which covers many practically interesting cases and offers a possibility for efficient exploration of the best networks consistent with different variable orderings.