Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems

John A. Keith, Valentin Vassilev-Galindo, Bingqing Cheng, Stefan Chmiela, Michael Gastegger, Klaus-Robert Müller, and Alexandre Tkatchenko
Chemical Reviews, pp. 9816-9872
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and…
Deep integration of machine learning into computational chemistry and materials science
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the…
Computational discovery of energy materials in the era of big data and machine learning: A critical review
  • Ziheng Lu
  • Computer Science
  • Materials Reports: Energy
  • 2021
In this report, recent advances in material discovery methods are reviewed for energy devices and three paradigms based on empiricism-driven experiments, database-driven high-throughput screening, and data informatics-driven machine learning are discussed critically.
Benchmarking graph neural networks for materials chemistry
Graph neural networks (GNNs) have received intense interest as a rapidly expanding class of machine learning models remarkably well-suited for materials applications. To date, a number of successful…
SE(3)-equivariant prediction of molecular wavefunctions and electronic densities
Machine learning has enabled the prediction of quantum chemical properties with high accuracy and efficiency, making it possible to bypass computationally costly ab initio calculations. Instead of training on…
Operator-induced structural variable selection with applications to materials genomes
We propose a new method for variable selection with operator-induced structure (OIS), in which the predictors are engineered from a limited number of primary variables and a set of elementary…
Quantum Mechanics Enables "Freedom of Design" in Molecular Property Space
Rational design of molecules with targeted properties requires understanding quantum-mechanical (QM) structure-property/property-property relationships (SPR/PPR) across chemical compound space. We…
Inverse design of 3d molecular structures with conditional generative neural networks
This research presents a probabilistic architecture for machine learning that automates the labor-intensive, and therefore slow and expensive, process of training neural networks.
Topological Characterization and Graph Entropies of Tessellations of Kekulene Structures: Existence of Isentropic Structures and Applications to Thermochemistry, Nuclear Magnetic Resonance, and Electron Spin Resonance.
The developed techniques can be applied in the general context of artificial intelligence for the machine generation of nuclear magnetic resonance and electron spin resonance spectroscopic patterns, as well as in robust computations of the thermochemistry of large combinatorial libraries of tessellations of kekulenes through the generation of bond-equivalence classes.
Optimal Sampling Density for Nonparametric Regression
The proposed active learning method outperforms the existing state-of-the-art model-agnostic approaches and factorizes the influence of local function complexity, noise level and test density in a transparent and interpretable way.
Accurate large-scale simulations of siliceous zeolites by neural network potentials
The tremendous diversity of zeolite frameworks makes ab initio simulations of their structure, stability, reactivity and properties virtually impossible. To enable large-scale reactive simulations of…


Machine learning the ropes: principles, applications and directions in synthetic chemistry.
Different approaches to representing and utilizing organic molecules are discussed, providing synthetic chemists with both the understanding and the tools required to apply machine learning in the context of their research, along with pointers for further study.
Machine learning in chemoinformatics and drug discovery.
Basic principles and recent case studies are presented to demonstrate the utility of machine learning techniques in chemoinformatics analyses, and limitations and future directions are discussed to guide further development in this evolving field.
Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems.
A review of the current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab-initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Perspective: Energy Landscapes for Machine Learning
This Perspective aims to describe analogies to molecular structure, thermodynamics, and kinetics with examples from recent applications, and to suggest avenues for new interdisciplinary research.
Machine learning for molecular and materials science
A future in which the design, synthesis, characterization and application of molecules and materials is accelerated by artificial intelligence is envisaged.
Predicting reaction performance in C–N cross-coupling using machine learning
It is demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation, and that it provides significantly improved predictive performance over linear regression analysis.
Machine learning for heterogeneous catalyst design and discovery
Advances in machine learning (ML) are making a large impact in many fields, including artificial intelligence, materials science, and chemical engineering. Generally, ML tools learn from data to…
Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies.
A number of established machine learning techniques are outlined and the influence of the molecular representation on the methods' performance is investigated, finding that the best methods achieve prediction errors of 3 kcal/mol for the atomization energies of a wide variety of molecules.
Deep learning for computational chemistry
This review provides an introductory overview into the theory of deep neural networks and their unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics, and highlights its ubiquity and broad applicability to a wide range of challenges in the field.
Neural Networks for the Prediction of Organic Chemistry Reactions
This work explores the use of neural networks for predicting reaction types, using a new reaction fingerprinting method, and combines this predictor with SMARTS transformations to build a system that, given a set of reagents and reactants, predicts the likely products.