NOMAD: The FAIR concept for big data-driven materials science

Authors: Claudia Draxl and Matthias Scheffler
Journal: MRS Bulletin

Data-centric science for materials innovation
This issue of MRS Bulletin focuses on the numerous efforts in developing and utilizing databases of electronic structure calculations, and their impact on addressing different classes of problems in materials science.
The evolving landscape for alloy design
Big Data-Driven Materials Science and Its FAIR Data Infrastructure
This chapter addresses the fourth paradigm of materials research: big-data-driven materials science. Its concepts and state of the art are described, and its challenges and opportunities are discussed.
Global Research on Big Data in Relation with Artificial Intelligence (A Bibliometric Study: 2008-2019)
This research highlights the current issues discussed and studied by scholars around the globe and shows publication trends for big data in relation to artificial intelligence, based on research outcomes in highly reputable SCI-Exp and SSCI journals (ranked by WoS).
CateCom: a practical data-centric approach to categorization of computational models
This work applies object-oriented design concepts and outlines the foundations of an open-source collaborative framework that can uniquely describe the approaches in structured data, is flexible enough to cover the majority of widely used models, and utilizes collective intelligence through community contributions.
FAIR and Interactive Data Graphics from a Scientific Knowledge Graph
This pairing of SPARQL and Vega-Lite—demonstrated here in the domain of polymer nanocomposite materials science—offers an extensible approach to FAIR (findable, accessible, interoperable, reusable) scientific data visualization within a knowledge graph framework.
Managing FAIR Tribological Data Using Kadi4Mat
This work demonstrates the versatility of the open source research data infrastructure Kadi4Mat by managing and producing FAIR tribological data and shows a practical bottom-up approach and how such infrastructures are an essential part of the authors' FAIR digital future.
Data‐Driven Materials Science: Status, Challenges, and Perspectives
The historical development and current state of data‐driven materials science, from the early evolution of open science to the rapid expansion of materials data infrastructures, are discussed, providing a perspective on the future development of the field.
Tracking materials science data lineage to manage millions of materials experiments and analyses
The Materials Experiment and Analysis Database (MEAD) is a database that contains raw data and metadata from millions of materials synthesis and characterization experiments, as well as the analysis and distillation of that data into property and performance metrics via software in an accompanying open source repository.
Envisioning data sharing for the biocomputing community
A coordinated initiative is proposed, focusing on the computational biophysics and biochemistry community but general and flexible in its defining characteristics, which aims at addressing the growing necessity of collecting, rationalizing, sharing and exploiting the data produced in this scientific environment.


Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats
A key element of this work is the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations, which was agreed upon by two teams and is presented in this perspective paper.
Big data of materials science: critical role of the descriptor.
A trustful prediction of new promising materials, identification of anomalies, and scientific advancement are doubtful when the scientific connection between the descriptor and the actuating mechanisms is unclear.
Building the Materials Innovation Infrastructure: Data and Standards
This report summarizes the results of the workshop "A Materials Genome Initiative Workshop – Building the Materials Innovation Infrastructure: Data and Standards" held May 14-15,
Commentary: The Materials Project: A materials genome approach to accelerating materials innovation
Accelerating the discovery of advanced materials is essential for human welfare and sustainable, clean energy. In this paper, we introduce the Materials Project, a core
Learning physical descriptors for materials science by compressed sensing
A compressed-sensing based methodology for feature selection, specifically for discovering physical descriptors, i.e., physical parameters that describe the material and its properties of interest, and associated equations that explicitly and quantitatively describe those relevant properties.
Uncovering structure-property relationships of materials by subgroup discovery
Subgroup discovery is presented here as a data-mining approach to help find interpretable local patterns, correlations, and descriptors of a target property in materials-science data, using data generated by density-functional theory calculations.
The high-throughput highway to computational materials design.
A current snapshot of high-throughput computational materials design is provided, and the challenges and opportunities that lie ahead are highlighted.
Machine learning in materials informatics: recent applications and prospects
This article attempts to provide an overview of some of the recent successful data-driven “materials informatics” strategies undertaken in the last decade, with particular emphasis on the fingerprint or descriptor choices.
From core referencing to data re-use: two French national initiatives to reinforce paleodata stewardship (National Cyber Core Repository and LTER France Retro-Observatory)
ROZA was developed under the umbrella of LTER-France (Long Term Ecological Research) to facilitate the re-use of data and samples, and it will favor the use of paleodata by non-paleodata scientists, in particular ecologists.
Insightful classification of crystal structures using deep learning
This study uses machine learning to automatically classify more than 100,000 simulated perfect and defective crystal structures, paving the way for crystal structure recognition of—possibly noisy and incomplete—three-dimensional structural data in big-data materials science.