Sparsifying priors for Bayesian uncertainty quantification in model discovery

Authors: Seth M. Hirsh, David A. Barajas-Solano, J. Nathan Kutz
Journal: Royal Society Open Science
We propose a probabilistic model discovery method for identifying ordinary differential equations governing the dynamics of observed multivariate data. Our method is based on the sparse identification of nonlinear dynamics (SINDy) framework, where models are expressed as sparse linear combinations of pre-specified candidate functions. Promoting parsimony through sparsity leads to interpretable models that generalize to unknown data. Instead of targeting point estimates of the SINDy coefficients… 
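For reference, the SINDy idea the abstract builds on can be sketched with sequential thresholded least squares over a candidate library. This is a minimal illustration, not the paper's Bayesian formulation; the toy system dx/dt = -2x, the library, and the threshold 0.1 are all assumptions made for the example.

```python
import numpy as np

# Toy sequential thresholded least squares (STLSQ), the core of SINDy:
# recover dx/dt = -2x from noiseless data over a small candidate library.
x = np.linspace(0.1, 2.0, 50)                        # sampled states
dxdt = -2.0 * x                                      # exact derivatives
Theta = np.column_stack([np.ones_like(x), x, x**2])  # library: [1, x, x^2]

xi = np.linalg.lstsq(Theta, dxdt, rcond=None)[0]     # initial least squares
for _ in range(10):                                  # zero small terms, refit the rest
    small = np.abs(xi) < 0.1
    xi[small] = 0.0
    xi[~small] = np.linalg.lstsq(Theta[:, ~small], dxdt, rcond=None)[0]

print(xi)  # coefficient on x is -2; the other library terms are pruned
```

The Bayesian variant proposed in the paper replaces these point estimates with a posterior over the coefficients, which the thresholding step only caricatures here.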


Bayesian Spline Learning for Equation Discovery of Nonlinear Dynamics with Quantified Uncertainty

A novel Bayesian spline learning framework to identify parsimonious governing equations of nonlinear (spatio)temporal dynamics from sparse, noisy data with quantified uncertainty is developed and evaluated on multiple nonlinear dynamical systems governed by canonical ordinary and partial differential equations.

Bayesian autoencoders for data-driven discovery of coordinates, governing equations and fundamental constants

The Bayesian SINDy autoencoder achieves better physics discovery with less data and fewer training epochs, provides valid uncertainty quantification in experimental studies, and is applied to real video data.

Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control

This work leverages the statistical approach of bootstrap aggregating (bagging) to robustify the sparse identification of nonlinear dynamics (SINDy) algorithm and shows that ensemble statistics from E-SINDy can be exploited for active learning and improved model predictive control.
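A minimal sketch of the bagging idea behind ensemble SINDy, assuming a toy one-dimensional system and plain thresholded least squares for each ensemble member; the noise level, threshold, and ensemble size are illustrative, not the paper's settings.

```python
import numpy as np

# Bagging sketch: bootstrap the samples, fit and threshold each ensemble
# member, then aggregate coefficients and term-inclusion probabilities.
rng = np.random.default_rng(0)
x = np.linspace(0.1, 2.0, 100)
dxdt = -2.0 * x + 0.05 * rng.standard_normal(x.size)   # noisy derivative data
Theta = np.column_stack([np.ones_like(x), x, x**2])    # library: [1, x, x^2]

coefs = []
for _ in range(200):                                   # bootstrap resamples
    idx = rng.integers(0, x.size, x.size)              # sample rows with replacement
    xi = np.linalg.lstsq(Theta[idx], dxdt[idx], rcond=None)[0]
    xi[np.abs(xi) < 0.1] = 0.0                         # threshold each member
    coefs.append(xi)
coefs = np.array(coefs)

inclusion = (coefs != 0).mean(axis=0)  # fraction of members keeping each term
median_xi = np.median(coefs, axis=0)   # robust aggregate model
```

The ensemble spread in `coefs` is what supplies the uncertainty estimates that active learning and control can exploit.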

Bayesian operator inference for data-driven reduced-order modeling

A Toolkit for Data-Driven Discovery of Governing Equations in High-Noise Regimes

An extensive toolkit of methods for circumventing the deleterious effects of noise in the context of the SINDy framework, and a technique that uses linear dependencies among functionals to transform a discovered model into an equivalent form that is closest to the true model, enabling more accurate assessment of a discovered model's correctness.

Approximating a Laplacian Prior for Joint State and Model Estimation within an UKF

A major challenge in state estimation with model-based observers is low-quality models that lack relevant dynamics. We address this issue by simultaneously estimating the system's states and …

Automated Learning of Interpretable Models with Quantified Uncertainty

A Bayesian Approach for Data-Driven Dynamic Equation Discovery

A Bayesian data-driven approach to nonlinear dynamic equation discovery is presented, which can accommodate measurement noise and missing data, which are common in complex nonlinear systems, and accounts for model parameter uncertainty.

Equation discovery from data: promise and pitfalls, from rabbits to Mars

Equation discovery seeks to reconstruct the underlying dynamics of a time-varying system from observations of the system, and moreover to do so in an instructive way such that we may …

A Bayesian Approach for Spatio-Temporal Data-Driven Dynamic Equation Discovery

A Bayesian approach to data-driven discovery of non-linear spatio-temporal dynamic equations is developed that can accommodate measurement noise and missing data, both of which are common in real-world data, and accounts for parameter uncertainty.



Bayesian differential programming for robust systems identification under uncertainty

This paper presents a machine learning framework for Bayesian systems identification from noisy, sparse and irregular observations of nonlinear dynamical systems. The proposed method takes advantage …

Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data

A variant of the SINDy algorithm is developed that integrates automatic differentiation and recent time-stepping constraints motivated by Rudy et al., and can learn a diversity of probability distributions for the measurement noise, including Gaussian, uniform, Gamma, and Rayleigh distributions.

Bayesian System ID: Optimal management of parameter, model, and measurement uncertainty

This work compares estimators of future system behavior derived from the Bayesian posterior of a learning problem with several least squares-based optimization objectives commonly used in system ID, indicating that the log posterior has improved geometric properties compared with the objective-function surfaces of traditional methods.

Dirichlet–Laplace Priors for Optimal Shrinkage

This article proposes a new class of Dirichlet–Laplace priors, which possess optimal posterior concentration and lead to efficient posterior computation.

The Bayesian Lasso

The Lasso estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters have independent Laplace (i.e., double-exponential) priors.
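This posterior-mode correspondence can be checked numerically in one dimension, where the MAP estimate under a Laplace prior has the closed-form soft-thresholding (lasso) solution. The data and hyperparameters below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# 1-D check: with y_i ~ N(beta, sigma2) and a Laplace prior p(beta) ∝ exp(-lam*|beta|),
# the posterior mode equals the soft-thresholded sample mean (the lasso fit).
y = np.array([0.9, 1.1, 1.0, 0.8])   # observations of a scalar mean beta
sigma2, lam = 0.25, 4.0              # noise variance and Laplace rate

def neg_log_post(b):
    return np.sum((y - b) ** 2) / (2 * sigma2) + lam * abs(b)

grid = np.linspace(-3.0, 3.0, 60001)                    # brute-force minimization
b_map = grid[np.argmin([neg_log_post(b) for b in grid])]

# Closed-form lasso solution: soft-threshold the sample mean.
ybar, n = y.mean(), y.size
b_lasso = np.sign(ybar) * max(abs(ybar) - sigma2 * lam / n, 0.0)
```

Both routes land on the same estimate, making concrete the statement that the lasso penalty is the negative log of a Laplace prior.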


Under compatibility conditions on the design matrix, the posterior distribution is shown to contract at the optimal rate for recovery of the unknown sparse vector, and to give optimal prediction of the response vector.

Sparsity information and regularization in the horseshoe and other shrinkage priors

A concept of effective number of nonzero parameters is introduced, along with an intuitive way of formulating the prior for the global hyperparameter based on the sparsity assumptions; the previous default choices are argued to be dubious given their tendency to favor solutions with more unshrunk parameters than one would typically expect a priori.
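A sketch of that effective-number-of-nonzeros quantity, assuming the regression-style shrinkage factor κ_j = 1/(1 + n σ⁻² τ² λ_j²) used in the horseshoe literature; the values of n, D, σ, τ and the sampled local scales below are illustrative.

```python
import numpy as np

# Shrinkage factors and effective number of nonzeros for the horseshoe:
# kappa_j = 1 / (1 + n * tau^2 * lambda_j^2 / sigma^2), m_eff = sum_j (1 - kappa_j).
rng = np.random.default_rng(0)
n, D, sigma, tau = 100, 50, 1.0, 0.1    # sample size, dimension, noise scale, global scale
lam = np.abs(rng.standard_cauchy(D))    # local scales lambda_j ~ C+(0, 1)

kappa = 1.0 / (1.0 + n * tau**2 * lam**2 / sigma**2)   # 1 = fully shrunk, 0 = unshrunk
m_eff = np.sum(1.0 - kappa)             # expected count of effectively unshrunk terms
```

Choosing the global scale τ so that the prior mean of `m_eff` matches the expected number of relevant terms is the intuitive prior formulation the summary refers to.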


This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior …

Inferring Biological Networks by Sparse Identification of Nonlinear Dynamics

This method, implicit-SINDy, succeeds in inferring three canonical biological models: 1) Michaelis-Menten enzyme kinetics; 2) the regulatory network for competence in bacteria; and 3) the metabolic network for yeast glycolysis.

Spike and slab variable selection: Frequentist and Bayesian strategies

This paper introduces a variable selection method referred to as a rescaled spike and slab model, and studies the usefulness of continuous bimodal priors to model hypervariance parameters, and the effect scaling has on the posterior mean through its relationship to penalization.
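A minimal sketch of the continuous spike-and-slab idea: each coefficient is drawn from either a near-zero "spike" or a wide "slab" normal, with the mixture weight acting as a prior inclusion probability. The dimension, weight, and variances below are illustrative, not the rescaled model's actual hypervariance structure.

```python
import numpy as np

# Continuous spike-and-slab draw: each coefficient comes from a near-zero
# "spike" or a wide "slab" normal; w is the prior inclusion probability.
rng = np.random.default_rng(0)
D, w = 1000, 0.2
slab = rng.binomial(1, w, D).astype(bool)      # which coefficients are "active"
beta = np.where(slab,
                rng.normal(0.0, 3.0, D),       # slab: large variance
                rng.normal(0.0, 0.01, D))      # spike: essentially zero
```

Posterior inference over the indicator `slab` is what turns this prior into a variable-selection procedure.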