Learning physical descriptors for materials science by compressed sensing

Luca M. Ghiringhelli, Jan Vybíral, Emre Ahmetcik, Runhai Ouyang, Sergey V. Levchenko, Claudia Draxl, Matthias Scheffler
New Journal of Physics
The availability of big data in materials science offers new routes for analyzing materials properties and functions and achieving scientific understanding. Finding structure in these data that is not directly visible by standard tools and exploitation of the scientific information requires new and dedicated methodology based on approaches from statistical learning, compressed sensing, and other recent methods from applied mathematics, computer science, statistics, signal processing, and… 

Simultaneous learning of several materials properties from incomplete databases with multi-task SISSO

This work describes a powerful extension of the SISSO methodology to a ‘multi-task learning’ approach, which identifies a single descriptor capturing multiple target materials properties at the same time, specifically suited for a heterogeneous materials database with scarce or partial data.

SISSO: A compressed-sensing method for identifying the best low-dimensional descriptor in an immensity of offered candidates

The sure independence screening and sparsifying operator (SISSO) tackles immense and correlated feature spaces, and converges to the optimal solution from a combination of features relevant to the materials' property of interest.

Recent advances and applications of machine learning in solid-state materials science

This review provides a comprehensive overview and analysis of the most recent research in machine learning principles, algorithms, descriptors, and databases in materials science, and proposes solutions and future research paths for various challenges in computational materials science.

From DFT to machine learning: recent approaches to materials science–a review

This review follows a logical sequence starting from density functional theory as the representative instance of electronic structure methods, to the subsequent high-throughput approach, used to generate large amounts of data.

Nanoinformatics, and the big challenges for the science of small things.

The combination of computational chemistry and computational materials science with machine learning and artificial intelligence provides a powerful way of relating structural features of

Exploring Two-Dimensional Materials Thermodynamic Stability via Machine Learning.

This work uses machine learning techniques to identify thermodynamically stable 2D materials, stability being the first essential requirement for any application, and demonstrates the usefulness of the model by generating more than a thousand novel compounds.

Machine learning in materials informatics: recent applications and prospects

This article attempts to provide an overview of some of the recent successful data-driven “materials informatics” strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices.

Data-driven descriptor for high-throughput screening of topological insulators

Significant advances have been made in predicting new topological materials using high-throughput empirical descriptors or symmetry-based indicators. This line of research has produced extensive

Impact of atomistic or crystallographic descriptors for classification of gold nanoparticles.

This study compares results of supervised and unsupervised learning on a single set of gold nanoparticles that has been characterised by two different descriptors, each with a unique feature space.



Data mining for materials: Computational experiments with AB compounds

Three tasks relevant to materials research are explored: separating a number of compounds into subsets by crystal structure, grouping an unknown compound with its most characteristically similar peers, and predicting a specific property (the melting point).

Compressive sensing as a paradigm for building physics models

Compressive sensing (CS) is a powerful paradigm for model building; its models are shown to be more physical and to predict more accurately than current state-of-the-art approaches, and they can be constructed at a fraction of the computational cost and user effort.

Accelerating materials property predictions using machine learning

It is shown that fingerprints based on either chemo-structural information (compositional and configurational) or the electronic charge density distribution can be used to make ultra-fast, yet accurate, property predictions.

Materials Cartography: Representing and Mining Material Space Using Structural and Electronic Fingerprints

The issue of scientific discovery in materials databases is addressed by introducing novel analytical approaches based on structural and electronic materials fingerprints, which contribute to the emerging field of materials informati...

Machine Learning Strategy for Accelerated Design of Polymer Dielectrics

This work addresses the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace.

A Mathematical Introduction to Compressive Sensing

A Mathematical Introduction to Compressive Sensing gives a detailed account of the core theory upon which the field is built and serves as a reliable resource for practitioners and researchers in these disciplines who want to acquire a careful understanding of the subject.

How to represent crystal structures for machine learning: Towards fast prediction of electronic properties

Conventional representations of the input data, such as the Coulomb matrix, are found to be unsuitable for training learning machines on periodic solids; the work proposes a novel crystal structure representation for which learning and competitive prediction accuracies become possible within an unrestricted class of spd systems of arbitrary unit-cell size.

Finding Nature’s Missing Ternary Oxide Compounds Using Machine Learning and Density Functional Theory

Finding new compounds and their crystal structures is an essential step to new materials discoveries. We demonstrate how this search can be accelerated using a combination of machine learning

Compressed modes for variational problems in mathematics and physics

This article describes a general formalism for obtaining spatially localized solutions to a class of problems in mathematical physics, which can be recast as variational optimization problems, such as the important case of Schrödinger’s equation in quantum mechanics.