• Corpus ID: 237532175

MOFSimplify: Machine Learning Models with Extracted Stability Data of Three Thousand Metal-Organic Frameworks

Aditya Nandy, Guillermo Terrones, N. Arunachalam, Chunyan Duan, David W Kastner, Heather J. Kulik
We report a workflow and the output of a natural language processing (NLP)-based procedure to mine the extant metal-organic framework (MOF) literature describing structurally characterized MOFs and their solvent removal and thermal stabilities. We obtain over 2,000 solvent removal stability measures from text mining and 3,000 thermal decomposition temperatures from thermogravimetric analysis data. We assess the validity of our NLP methods and the accuracy of our extracted data by comparing to a… 
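As a rough illustration of what sentence-level text mining for solvent removal stability can look like, here is a minimal sketch. It is not the authors' pipeline: the cue lists, the function names (`split_sentences`, `label_sentence`, `screen`), and the three-way labeling are all assumptions made for this example, and the thermal decomposition temperatures, which the abstract says come from thermogravimetric analysis data, are not covered.

```python
import re

# Hypothetical cue lists chosen for illustration only; the vocabulary
# actually used in the paper is not given in this abstract.
SOLVENT_CUES = ("solvent removal", "desolvat", "activation", "evacuat", "guest removal")
STABLE_CUES = ("retains crystallinity", "remains intact", "permanent porosity")
UNSTABLE_CUES = ("collapse", "loss of crystallinity", "amorphous", "decompos")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label_sentence(sentence):
    """Return 'stable', 'unstable', 'ambiguous', or None if off-topic."""
    s = sentence.lower()
    if not any(cue in s for cue in SOLVENT_CUES):
        return None  # sentence does not mention solvent removal at all
    stable = any(cue in s for cue in STABLE_CUES)
    unstable = any(cue in s for cue in UNSTABLE_CUES)
    if stable and not unstable:
        return "stable"
    if unstable and not stable:
        return "unstable"
    return "ambiguous"  # conflicting cues or none: leave for manual review


def screen(text):
    """Collect (sentence, label) pairs for sentences about solvent removal."""
    results = []
    for sentence in split_sentences(text):
        label = label_sentence(sentence)
        if label is not None:
            results.append((sentence, label))
    return results


if __name__ == "__main__":
    sample = (
        "The framework collapses upon solvent removal. "
        "The yield was 60%. "
        "After desolvation the material retains crystallinity."
    )
    for sentence, label in screen(sample):
        print(label, "|", sentence)
```

Plain substring matching like this is brittle (negation and hedged wording both defeat it), which is one reason extracted labels need validation against a manually curated subset.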

Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery
Increasingly sophisticated natural language processing and automated image analysis are making it possible to learn structure-property relationships from the literature, and ML models trained on these data sets will improve as they incorporate community feedback.
Realizing the data-driven, computational discovery of metal-organic framework catalysts
Metal–organic frameworks (MOFs) have been widely investigated for challenging catalytic transformations due to their well-defined structures and high degree of synthetic tunability. These features,


Using Machine Learning and Data Mining to Leverage Community Knowledge for the Engineering of Stable Metal-Organic Frameworks
This work extracts thousands of published reports of the key aspects of MOF stability necessary for their practical application: the ability to withstand high temperatures without degrading and the capacity to be activated by removal of solvent molecules.
Text Mining Metal-Organic Framework Papers
A simple text mining algorithm is developed that identifies the surface areas and pore volumes of metal-organic frameworks (MOFs) using manuscript HTML files as inputs.
Discovering Relationships between OSDAs and Zeolites through Data Mining and Generative Neural Networks
A data-driven approach unearths generalized OSDA–zeolite relationships using a comprehensive database comprising 5,663 synthesis routes for porous materials and adapts a generative neural network capable of suggesting new molecules as potential OSDAs for a given zeolite structure and gel chemistry.
Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models.
The first machine learning models that predict the likelihood of successful simulation outcomes and a metric of model uncertainty based on the distribution of points in the latent space to systematically improve model prediction confidence are introduced.
Structure-Mechanical Stability Relations of Metal-Organic Frameworks via Machine Learning
The overarching mechanical screening approach presented here reveals the sensitivity to structural parameters such as topology, coordination characteristics and the nature of the building blocks, and paves the way for computational as well as experimental researchers to assess and design MOFs with enhanced mechanical stability to accelerate the translation of MOFs to industrial applications.
Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning
In the past several years, Materials Genome Initiative (MGI) efforts have produced myriad examples of computationally designed materials in the fields of energy storage, catalysis, thermoelectrics,
A Quantitative Uncertainty Metric Controls Error in Neural Network-Driven Chemical Discovery
Machine learning (ML) models, such as artificial neural networks, have emerged as a complement to high-throughput screening, enabling characterization of new compounds in seconds instead of hours.
Prediction of water stability of metal-organic frameworks using machine learning
An efficient and instant machine learning (ML)-based strategy for screening water-stable MOFs is presented, and the results strongly suggest the applicability and merit of the surrogate models developed in this work.
Resolving Transition Metal Chemical Space: Feature Selection for Machine Learning and Structure-Property Relationships.
A series of revised autocorrelation functions (RACs), which encode relationships between heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph, is introduced to make such descriptors amenable to inorganic chemistry.
A Machine Learning Approach to Zeolite Synthesis Enabled by Automatic Literature Data Extraction
This paper creates natural language processing techniques and text markup parsing tools to automatically extract synthesis information and trends from zeolite journal articles and engineers a data set of germanium-containing zeolites to test the accuracy of the extracted data and to discover potential opportunities for zeolitic morphologies.