Tracking materials science data lineage to manage millions of materials experiments and analyses

Edwin Soedarmadji, John M Gregoire, Helge S Stein, Santosh K Suram, Dan Guevarra

doi:10.1038/s41524-019-0216-x

Abstract

In an era of rapid advancement of algorithms that extract knowledge from data, data and metadata management are increasingly critical to research success. In materials science, there are few examples of experimental databases that contain many different types of information, and compared with other disciplines, the database sizes are relatively small. Underlying these issues are the challenges in managing and linking data across disparate synthesis and characterization experiments, which we address with the development of a lightweight data management framework that is generally applicable for experimental science and beyond. Five years of managing experiments with this system has yielded the Materials Experiment and Analysis Database (MEAD) that contains raw data and metadata from millions of materials synthesis and characterization experiments, as well as the analysis and distillation of that data into property and performance metrics via software in an accompanying open source repository. The unprecedented quantity and diversity of experimental data are searchable by experiment and analysis attributes generated by both researchers and data processing software. The search web interface allows users to visualize their search results and download zipped packages of data with full annotations of their lineage. The enormity of the data provides substantial challenges and opportunities for incorporating data science in the physical sciences, and MEAD’s data and algorithm management framework will foster increased incorporation of automation and autonomous discovery in materials and chemistry research.

Highlights

Its user interface is optimized for retrieving data rather than for data input and data management.
Photoelectrochemical performance is assessed via a series of specialized scanning droplet cells.[44,45]
The range of experimental techniques is ever evolving, but as soon as a technique is no longer a one-off experiment, or is intended to be run regularly in-house, the pipeline is amended to meet the specific needs of the new technique.

Summary

Introduction

The critical role of materials in many technologies, combined with the opportunity for accelerating materials discovery and optimization via modern data science, motivates a transformation in how materials information is generated, stored, and retrieved,[1,2,3,4,5,6] a transformation that is well underway in other research fields.[7,8,9,10,11,12,13] Historically, the only way to retrieve fundamental properties of mostly “simple” materials (the elements and some binary phases) involved a manual lookup in the seminal materials databases, such as the CRC materials table,[14] the Landolt–Börnstein collection,[15] or the ASM phase diagram table.[16] The exploration of vast, high-dimensional composition spaces motivates the establishment of new data management protocols for organizing and disseminating materials data. Computational materials databases such as Materials Project,[20] OQMD,[21] and AFLOW[22] have pioneered this effort for virtual materials, and the recent release of the High Throughput Experimental Materials (HTEM) database[6] and the present work comprise important advances in data management and dissemination of materials experiments, highlighting the importance and challenges of metadata management in experimental materials science. The Materials Experiment and Analysis Database[32] (MEAD) facilitates retrieval of the experiments that were performed on a given material and the ensuing analysis that generated the inferred materials properties.

Journal: npj Computational Materials
Publication Date: Jul 26, 2019
Citations: 42
License type: open-access
