Sparsifying priors for Bayesian uncertainty quantification in model discovery

Seth M. Hirsh, David A. Barajas-Solano and J. Nathan Kutz. Royal Society Open Science.
We propose a probabilistic model discovery method for identifying ordinary differential equations governing the dynamics of observed multivariate data. Our method is based on the sparse identification of nonlinear dynamics (SINDy) framework, where models are expressed as sparse linear combinations of pre-specified candidate functions. Promoting parsimony through sparsity leads to interpretable models that generalize to unknown data. Instead of targeting point estimates of the SINDy coefficients… 
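The SINDy framework described in the abstract can be illustrated with a minimal sketch (not the authors' implementation): build a library of candidate functions and apply sequentially thresholded least squares to recover a sparse model. The system, library, and threshold below are illustrative assumptions.

```python
import numpy as np

# Minimal SINDy-style sketch: recover dx/dt = -2x from noisy derivative
# samples using a polynomial candidate library and sequentially
# thresholded least squares (STLSQ).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
dxdt = -2.0 * x + 0.01 * rng.standard_normal(x.size)  # noisy derivatives

# Candidate library Theta = [1, x, x^2, x^3]
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

# STLSQ: alternate least squares with hard thresholding of small terms
xi = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]
for _ in range(10):
    small = np.abs(xi) < 0.1        # zero out small coefficients
    xi[small] = 0.0
    big = ~small
    if big.any():                   # refit on the surviving terms only
        xi[big] = np.linalg.lstsq(Theta[:, big], dxdt, rcond=None)[0]

print(np.round(xi, 2))  # coefficient on x should be close to -2, rest zero
```

The Bayesian extension proposed in the paper replaces this point estimate of xi with a posterior distribution under a sparsifying prior.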


Bayesian autoencoders for data-driven discovery of coordinates, governing equations and fundamental constants

The Bayesian SINDy autoencoder achieves better physics discovery with less data and fewer training epochs, along with valid uncertainty quantification as suggested by the experimental studies, and is applied to real video data.

Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control

This work leverages the statistical approach of bootstrap aggregating (bagging) to robustify the sparse identification of nonlinear dynamics (SINDy) algorithm and shows that ensemble statistics from E-SINDy can be exploited for active learning and improved model predictive control.
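The bagging idea can be sketched as follows (an illustrative toy, not the E-SINDy code): fit a thresholded regression to many bootstrap resamples of the data, then aggregate the coefficient ensemble into per-term inclusion probabilities and a median model. The system and thresholds are assumptions for the example.

```python
import numpy as np

# Bagging sketch in the spirit of E-SINDy: bootstrap-resample the data,
# fit a thresholded least-squares model to each resample, and aggregate.
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(-1, 1, n)
dxdt = 1.5 * x - 0.8 * x**3 + 0.05 * rng.standard_normal(n)
Theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

def stlsq(A, b, threshold=0.2, iters=10):
    """One sequentially thresholded least-squares fit."""
    xi = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iters):
        xi[np.abs(xi) < threshold] = 0.0
        keep = xi != 0.0
        if keep.any():
            xi[keep] = np.linalg.lstsq(A[:, keep], b, rcond=None)[0]
    return xi

# Ensemble of models fit to bootstrap resamples of the rows
coefs = []
for _ in range(100):
    idx = rng.integers(0, n, n)            # sample rows with replacement
    coefs.append(stlsq(Theta[idx], dxdt[idx]))
coefs = np.array(coefs)

inclusion = (coefs != 0).mean(axis=0)      # per-term inclusion probability
median_model = np.median(coefs, axis=0)    # robust aggregated coefficients
print(inclusion, np.round(median_model, 2))
```

The ensemble statistics (inclusion probabilities, coefficient spread) are what the paper exploits for active learning and control.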

Bayesian operator inference for data-driven reduced-order modeling

A Toolkit for Data-Driven Discovery of Governing Equations in High-Noise Regimes

An extensive toolkit of methods for circumventing the deleterious effects of noise in the context of the SINDy framework, and a technique that uses linear dependencies among functionals to transform a discovered model into an equivalent form that is closest to the true model, enabling more accurate assessment of a discovered model's correctness.

Approximating a Laplacian Prior for Joint State and Model Estimation within an UKF

A major challenge in state estimation with model-based observers is low-quality models that lack relevant dynamics. We address this issue by simultaneously estimating the system's states and …

Automated Learning of Interpretable Models with Quantified Uncertainty

A Bayesian Approach for Data-Driven Dynamic Equation Discovery

A Bayesian data-driven approach to nonlinear dynamic equation discovery is presented that accommodates measurement noise and missing data, both common in complex nonlinear systems, and accounts for model parameter uncertainty.

Equation discovery from data: promise and pitfalls, from rabbits to Mars

The problem of equation discovery seeks to reconstruct the underlying dynamics of a time-varying system from observations of the system, and moreover to do so in an instructive way such that we may …

A Bayesian Approach for Spatio-Temporal Data-Driven Dynamic Equation Discovery

A Bayesian approach to data-driven discovery of non-linear spatio-temporal dynamic equations is developed that can accommodate measurement noise and missing data, both of which are common in real-world data, and accounts for parameter uncertainty.

Data-driven discovery of governing equations for coarse-grained heterogeneous network dynamics

This work uses data-driven model discovery methods to determine the governing equations for the emergent behavior of heterogeneous networked dynamical systems whose collective behaviour approaches a limit cycle, and provides a numerical exploration of the dimension of collective network dynamics as a function of several network parameters.

Bayesian differential programming for robust systems identification under uncertainty

This paper presents a machine learning framework for Bayesian systems identification from noisy, sparse and irregular observations of nonlinear dynamical systems. The proposed method takes advantage …

Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data

A variant of the SINDy algorithm that integrates automatic differentiation and recent time-stepping constraints motivated by Rudy et al. is developed, which can learn a diversity of probability distributions for the measurement noise, including Gaussian, uniform, Gamma, and Rayleigh distributions.

Bayesian system ID: optimal management of parameter, model, and measurement uncertainty

A first-principles derivation of appropriate objective formulations for system identification based on probabilistic principles is studied, and the resulting inference objective is compared to those used by emerging data-driven methods based on dynamic mode decomposition (DMD) and sparse identification of nonlinear dynamics (SINDy).

Dirichlet–Laplace Priors for Optimal Shrinkage

This article proposes a new class of Dirichlet–Laplace priors, which possess optimal posterior concentration and lead to efficient posterior computation.

The Bayesian Lasso

The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors.
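The stated equivalence can be checked numerically (a sketch, not code from the paper): the negative log posterior under a Gaussian likelihood with independent Laplace priors is exactly the Lasso objective, so minimizing it with proximal gradient descent (ISTA) yields a sparse MAP estimate. The problem sizes and penalty below are illustrative assumptions.

```python
import numpy as np

# The Lasso estimate as a MAP estimate: minimize the negative log posterior
#   f(b) = (1/2) * ||y - X b||^2 + lam * ||b||_1
# (Gaussian likelihood + independent Laplace priors) via ISTA.
rng = np.random.default_rng(2)
X = rng.standard_normal((80, 10))
b_true = np.zeros(10)
b_true[[0, 3]] = [2.0, -1.5]                 # sparse ground truth
y = X @ b_true + 0.1 * rng.standard_normal(80)
lam = 5.0

def soft(v, t):
    """Proximal operator of t * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

L = np.linalg.norm(X, 2) ** 2                # Lipschitz constant of grad
b = np.zeros(10)
for _ in range(2000):                        # ISTA iterations
    b = soft(b - (X.T @ (X @ b - y)) / L, lam / L)

print(np.round(b, 2))  # nonzeros only at indices 0 and 3 (the MAP mode)
```

The Laplace prior induces the L1 penalty because its negative log density is lam * |b_j| plus a constant; the full Bayesian Lasso goes further and samples the entire posterior rather than just this mode.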


Under compatibility conditions on the design matrix, the posterior distribution is shown to contract at the optimal rate for recovery of the unknown sparse vector, and to give optimal prediction of the response vector.

Sparsity information and regularization in the horseshoe and other shrinkage priors

A concept of an effective number of nonzero parameters is introduced, an intuitive way of formulating the prior for the global hyperparameter based on the sparsity assumptions is shown, and the previous default choices are argued to be dubious based on their tendency to favor solutions with more unshrunk parameters than one would typically expect a priori.


This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior …

Inferring Biological Networks by Sparse Identification of Nonlinear Dynamics

This method, implicit-SINDy, succeeds in inferring three canonical biological models: 1) Michaelis-Menten enzyme kinetics; 2) the regulatory network for competence in bacteria; and 3) the metabolic network for yeast glycolysis.

A Unified Sparse Optimization Framework to Learn Parsimonious Physics-Informed Models From Data

A flexible ML-based framework for learning governing models for physical systems from data that addresses three open challenges in scientific problems and data sets, including robust handling of outliers and corrupt data within noisy sensor measurements, parametric dependencies in candidate library functions, and the imposition of physical constraints.