Corpus ID: 235624273

Objective discovery of dominant dynamical processes with intelligible machine learning

Bryan Kaiser, Juan A. Saenz, Maike Sonnewald, Daniel Livescu
The advent of big data has vast potential for discovery in natural phenomena ranging from climate science to medicine, but overwhelming complexity stymies insight. Existing theory is often not able to succinctly describe salient phenomena, and progress has largely relied on ad hoc definitions of dynamical regimes to guide and focus exploration. We present a formal definition in which the identification of dynamical regimes is formulated as an optimization problem, and we propose an intelligible… 

Bridging observations, theory and numerical simulation of the ocean using machine learning
Progress within physical oceanography has been concurrent with the increasing sophistication of tools available for its study. The incorporation of machine learning (ML) techniques offers exciting
Dimensionally Consistent Learning with Buckingham Pi
An automated approach using the symmetric and self-similar structure of available measurement data to discover the dimensionless groups that best collapse this data to a lower dimensional space according to an optimal fit is proposed.
Explainable Artificial Intelligence for Bayesian Neural Networks: Towards trustworthy predictions of ocean dynamics
A Bayesian Neural Network is implemented, where parameters are distributions rather than deterministic, and novel implementations of explainable AI (XAI) techniques are applied, revealing the extent to which the BNN is suitable and/or trustworthy.


Learning dominant physical processes with data-driven balance models
This work automates and generalizes the approach to non-asymptotic regimes by introducing the idea of an equation space, in which different local balances appear as distinct subspace clusters, and shows that this approach uncovers key mechanistic models in turbulence, combustion, nonlinear optics, geophysical fluids, and neuroscience.
Unsupervised Learning Reveals Geography of Global Ocean Dynamical Regions
Residual “dominantly nonlinear” regions highlight where the barotropic vorticity (BV) approach is inadequate; they are found in areas of rough topography in the Southern Ocean and along western boundaries.
Science and Hypothesis
Recent speculations in mathematical physics, and acquiescence in treatment in terms of unimaginable abstractions, have raised a general question about the use of hypothesis as a means of coordinating
Learning Credible Models
This work formally defines credibility in the linear setting and proposes a regularization penalty, expert yielded estimates (EYE), that incorporates expert knowledge about well-known relationships among covariates and the outcome of interest.
Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection
An integrated framework for density-based cluster analysis, outlier detection, and data visualization is introduced, consisting of an algorithm to compute hierarchical estimates of the level sets of a density, following Hartigan’s classic model of density-contour clusters and trees.
Problems and promises of deterministic extended range forecasting
The past 20 years have encompassed remarkable scientific and technical advances in the atmospheric and oceanic sciences which herald a new era for deterministically predicting atmospheric behavior.
Some methods for classification and analysis of multivariate observations
The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give
Sparse Principal Component Analysis
This work introduces a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings and shows that PCA can be formulated as a regression-type optimization problem.
Feature Selection for Unsupervised Learning
This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.
Regression Shrinkage and Selection via the Lasso
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.