# Objective discovery of dominant dynamical processes with intelligible machine learning

@article{Kaiser2021ObjectiveDO, title={Objective discovery of dominant dynamical processes with intelligible machine learning}, author={Bryan Kaiser and Juan A. Saenz and Maike Sonnewald and Daniel Livescu}, journal={ArXiv}, year={2021}, volume={abs/2106.12963} }

The advent of big data has vast potential for discovery in natural phenomena ranging from climate science to medicine, but overwhelming complexity stymies insight. Existing theory is often not able to succinctly describe salient phenomena, and progress has largely relied on ad hoc definitions of dynamical regimes to guide and focus exploration. We present a formal definition in which the identification of dynamical regimes is formulated as an optimization problem, and we propose an intelligible…

## 3 Citations

Bridging observations, theory and numerical simulation of the ocean using machine learning

- Environmental Science, Environmental Research Letters
- 2021

Progress within physical oceanography has been concurrent with the increasing sophistication of tools available for its study. The incorporation of machine learning (ML) techniques offers exciting…

Dimensionally Consistent Learning with Buckingham Pi

- Computer Science, ArXiv
- 2022

An automated approach is proposed that uses the symmetric and self-similar structure of available measurement data to discover the dimensionless groups that best collapse the data to a lower-dimensional space according to an optimal fit.

Explainable Artificial Intelligence for Bayesian Neural Networks: Towards trustworthy predictions of ocean dynamics

- Computer Science, ArXiv
- 2022

A Bayesian Neural Network (BNN) is implemented, whose parameters are distributions rather than deterministic values, and novel implementations of explainable AI (XAI) techniques are applied, revealing the extent to which the BNN is suitable and/or trustworthy.

## References

Showing 1-10 of 45 references

Learning dominant physical processes with data-driven balance models.

- Physics, Nature Communications
- 2021

This work automates and generalizes the approach to non-asymptotic regimes by introducing the idea of an equation space, in which different local balances appear as distinct subspace clusters, and shows that this approach uncovers key mechanistic models in turbulence, combustion, nonlinear optics, geophysical fluids, and neuroscience.

Unsupervised Learning Reveals Geography of Global Ocean Dynamical Regions

- Environmental Science, Earth and Space Science
- 2019

Residual “dominantly nonlinear” regions highlight where the barotropic vorticity (BV) approach is inadequate; these are found in areas of rough topography in the Southern Ocean and along western boundaries.

Science and Hypothesis

- Psychology, Nature
- 1929

Recent speculations in mathematical physics, and acquiescence in treatment in terms of unimaginable abstractions, have raised a general question about the use of hypothesis as a means of coordinating…

Learning Credible Models

- Computer Science, KDD
- 2018

This work formally defines credibility in the linear setting and proposes a regularization penalty, expert yielded estimates (EYE), that incorporates expert knowledge about well-known relationships among covariates and the outcome of interest.

Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection

- Computer Science, ACM Trans. Knowl. Discov. Data
- 2015

An integrated framework for density-based cluster analysis, outlier detection, and data visualization is introduced, consisting of an algorithm to compute hierarchical estimates of the level sets of a density, following Hartigan’s classic model of density-contour clusters and trees.

Problems and Promises of Deterministic Extended Range Forecasting

- Environmental Science
- 1969

The past 20 years have encompassed remarkable scientific and technical advances in the atmospheric and oceanic sciences which herald a new era for deterministically predicting atmospheric behavior.…

Some methods for classification and analysis of multivariate observations

- Mathematics
- 1967

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give…

Sparse Principal Component Analysis

- Computer Science
- 2006

This work introduces a new method called sparse principal component analysis (SPCA) using the lasso (elastic net) to produce modified principal components with sparse loadings and shows that PCA can be formulated as a regression-type optimization problem.

Feature Selection for Unsupervised Learning

- Computer Science, J. Mach. Learn. Res.
- 2004

This paper explores the feature selection problem and issues through FSSEM (Feature Subset Selection using Expectation-Maximization (EM) clustering) and through two different performance criteria for evaluating candidate feature subsets: scatter separability and maximum likelihood.

Regression Shrinkage and Selection via the Lasso

- Computer Science
- 1996

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
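
The lasso summary above can be written as a constrained least-squares problem. The sketch below uses notation assumed for illustration and not taken from this page: $N$ observations $(x_i, y_i)$, $p$ covariates, an intercept $\beta_0$, coefficients $\beta_j$, and a bound $t \ge 0$ playing the role of the "constant".

```latex
\hat{\beta}^{\text{lasso}}
  = \arg\min_{\beta_0,\,\beta}
    \sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  \quad \text{subject to} \quad
  \sum_{j=1}^{p} \lvert \beta_j \rvert \le t
```

Because the constraint region has corners on the coordinate axes, the solution often sets some coefficients exactly to zero, so the method tends to select variables as well as shrink them.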