Abstract

Since the announcement of the Materials Genome Initiative, research in materials science has become more data-intensive and collaborative, and the materials community has a compelling need for modern data infrastructures that facilitate the sharing of materials data and analysis tools. In this paper, we describe the challenges of developing such infrastructure and introduce an emerging architecture with high usability, which we call the Materials Genome Engineering Databases (MGED). MGED provides cloud-hosted services with features that simplify the collection of datasets from diverse data providers, unify data representation forms with a user-centered presentation data model, and accelerate data discovery with advanced search capabilities. MGED also provides a standard service management framework for finding and sharing tools that analyze and process data. We describe MGED’s design, its current status, and how it supports integrated management of shared data and services.

Highlights

  • The materials community is acknowledging that the availability of vast data resources carries the potential to answer questions previously out of reach

  • The Citrination platform has developed a hierarchical data structure called the Physical Information File that can accommodate complex materials data, ensuring that they are human searchable and machine readable for data mining[45]. These infrastructures provide a standardized data format to reduce the heterogeneity in the stored data, but their complex data types and structures mean that only technical experts can manipulate these formats. After considering these previous efforts, we believe that the development of modern data infrastructure for Materials Genome Engineering (MGE) will hinge on two main technical requirements corresponding to integrated management of shared data and services.

  • With these requirements in mind, we have developed the Materials Genome Engineering Databases (MGED).


Summary

INTRODUCTION

The materials community is acknowledging that the availability of vast data resources carries the potential to answer questions previously out of reach. (2) The infrastructure needs to provide a service management framework that can integrate various services and tools for analyzing and processing data and cooperate seamlessly with databases, which enables service discovery and data reuse. With these requirements in mind, we have developed the Materials Genome Engineering Databases (MGED). When the framework is fully developed and the integration process standard has been established, MGED will be open to all researchers in the materials community and will collaborate with them on developing and integrating useful tools that improve data utilization, which promotes service sharing and accelerates materials discovery.

DISCUSSION
METHODS
CODE AVAILABILITY