Band Gap Prediction for Large Organic Crystal Structures with Machine Learning

Bart Olsthoorn, R. Matthias Geilhufe, Stanislav S. Borysov, Alexander V. Balatsky
Advanced Quantum Technologies

Machine‐learning models are capable of capturing the structure–property relationship from a dataset of computationally demanding ab initio calculations. Over the past two years, the Organic Materials Database (OMDB) has hosted a growing number of calculated electronic properties of previously synthesized organic crystal structures. The complexity of the organic crystals contained within the OMDB, which have on average 82 atoms per unit cell, makes this database a challenging platform for… 

Machine learning the quantum-chemical properties of metal–organic frameworks for accelerated materials discovery

The first publicly available quantum-chemical database for MOFs is developed (the “QMOF database”), which consists of properties derived from density functional theory (DFT) for over 14,000 experimentally synthesized MOFs, and it is demonstrated how this new database can be used to identify MOFs with targeted electronic structure properties.

Machine Learning-Based Prediction of Crystal Systems and Space Groups from Inorganic Materials Compositions

This work proposes and evaluates machine-learning algorithms for determining the structure type of materials given only their compositions. It demonstrates that random forest (RF) with Magpie features generally outperforms other algorithms for binary and multiclass prediction of crystal systems and space groups, while a multilayer perceptron (MLP) with atom frequency features performs best for structural polymorphism prediction.

Predicting band gaps and band-edge positions of oxide perovskites using DFT and machine learning.

Density functional theory within the local or semilocal density approximations (DFT-LDA/GGA) has become a workhorse in electronic structure theory of solids, being extremely fast and reliable for

Machine Learning in Matter at Different Scales

An ML workflow capable of parametrizing Heisenberg Hamiltonians for new materials, bypassing the most demanding part of the process, is proposed, showcasing the universality of ML as a tool.

Graph Neural Network for Hamiltonian-Based Material Property Prediction

This work presents and compares several different graph convolution networks that are able to predict the band gap for inorganic materials and shows that the model can get a promising prediction accuracy with cross-validation.

Auto-generated database of semiconductor band gaps using ChemDataExtractor

This work presents an auto-generated database of 100,236 semiconductor band gap records, extracted from 128,776 journal articles with their associated temperature information, which is the largest open-source non-computational band gap database to date.

Machine learning approach to genome of two-dimensional materials with flat electronic bands

Many-body physics of electron-electron correlations plays a central role in condensed matter physics; it governs a wide range of phenomena, stretching from superconductivity to magnetism, and is

Data-Driven Design of a New Organic Semiconductor via an Electronic Structure Chart

Data-driven methodologies for designing new materials are developing apace, yet advances for organic crystals have been infrequent. For organic crystals, the need to predict solid-state electronic

Spin wave excitations of magnetic metalorganic materials

The Organic Materials Database (OMDB) is an open database hosting about 22,000 electronic band structures, density of states and other properties for stable and previously synthesized 3-dimensional



Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties.

A crystal graph convolutional neural networks framework to directly learn material properties from the connection of atoms in the crystal, providing a universal and interpretable representation of crystalline materials.

PubChemQC Project: A Large-Scale First-Principles Electronic Structure Database for Data-Driven Chemistry.

The fundamental features of the PubChemQC database are shown, the techniques used to construct the data set for large-scale quantum chemistry calculations are discussed, and a machine learning approach to predict the electronic structure of molecules is presented as an example to demonstrate the suitability of the large-scale quantum chemistry database.

Unified Representation of Molecules and Crystals for Machine Learning

A many-body tensor representation is introduced that is invariant to translations, rotations, and nuclear permutations of same elements; it is unique, differentiable, fast to compute, and can represent both molecules and crystals.

Machine Learning Predictions of Molecular Properties: Accurate Many-Body Potentials and Nonlocality in Chemical Space

A systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules is presented, including a vectorized representation of molecules (the so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space.

A Data-Driven Construction of the Periodic Table of the Elements

This work shows how one can generalize the SOAP kernel to introduce a distance-dependent weight that accounts for the multi-scale nature of the interactions, and a description of correlations between chemical species, to improve substantially the performance of ML models of molecular and materials stability.

Theory and Practice of Atom-Density Representations for Machine Learning

This work introduces an abstract definition of chemical environments that is based on a smoothed atomic density, using a bra-ket notation to emphasise basis set independence and to highlight the connections with some popular choices of descriptors for describing atomic systems.

SchNet - A deep learning architecture for molecules and materials.

The deep learning architecture SchNet, specifically designed to model atomistic systems by making use of continuous-filter convolutional layers, is presented and used to predict potential-energy surfaces and energy-conserving force fields for molecular dynamics simulations of small molecules.

Atom-density representations for machine learning.

An abstract definition of chemical environments that is based on a smoothed atomic density is introduced, using a bra-ket notation to emphasize basis set independence and to highlight the connections with some popular choices of representations for describing atomic systems.

Organic materials database: An open-access online database for data mining

This work illustrates the use of the OMDB and how it can become an organic part of search and prediction of novel functional materials via data mining techniques, and provides data mining results for metals and semiconductors, known to be rare in the class of organic materials.

Comparing molecules and solids across structural and alchemical space.

This work discusses how one can combine such local descriptors using a regularized entropy match (REMatch) approach to describe the similarity of both whole molecular and bulk periodic structures, introducing powerful metrics that enable the navigation of alchemical and structural complexities within a unified framework.