Abstract

The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials through a web-based user interface and an application programming interface. This paper also describes a high-throughput experimental (HTE) approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.

Highlights

  • Machine learning is a branch of computer science concerned with algorithms that can develop models from the available data, reveal trends and correlations in these data, and make predictions about unavailable data

  • The High Throughput Experimental Materials Database (HTEM DB) contains information about inorganic materials synthesized in thin film form, a combination that lends itself to high-throughput experimentation

  • The thin film sample libraries included in the High Throughput Experimental Materials Database (HTEM DB) are synthesized using combinatorial physical vapor deposition (PVD) methods, and each individual sample on a library is measured using spatially resolved characterization techniques (Fig. 2)



Introduction

Machine learning is a branch of computer science concerned with algorithms that can develop models from the available data, reveal trends and correlations in these data, and make predictions about unavailable data. The predictions rely on data mining, the process of discovering patterns in large data sets using statistical methods. Machine learning methods have recently been successful in process automation, natural language processing, and computer vision, where large databases are available to support data-driven modeling efforts. These successes sparked discussions about the potential of ‘Artificial Intelligence’ in science[1] and ‘The Fourth Paradigm’[2] of data-driven scientific discovery. For example, in advanced energy technologies, efficient solid state lighting was enabled by the use of gallium nitride in light-emitting diodes, electric cars were brought to life by intercalation materials used in lithium-ion batteries, and modern computers would not have been possible without the silicon material.
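
The three tasks in the definition of machine learning above (developing a model from available data, revealing a correlation, and predicting an unavailable data point) can be illustrated with a minimal sketch. The numbers below are synthetic and invented purely for illustration; they are not taken from the HTEM Database, and the variable names (`composition`, `measured`) and helper functions are hypothetical. A simple least-squares line stands in for the far more capable algorithms the article has in mind.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# "Available data": synthetic composition fractions and a made-up property value.
composition = [0.0, 0.2, 0.4, 0.6, 0.8]
measured = [1.02, 1.38, 1.81, 2.19, 2.60]

# 1. Develop a model from the available data.
slope, intercept = fit_line(composition, measured)

# 2. Reveal a trend or correlation in the data.
r = pearson_r(composition, measured)

# 3. Predict an "unavailable" data point (a composition that was never measured).
predicted = slope * 1.0 + intercept

print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f} predicted={predicted:.3f}")
```

With only five points and one input variable this is closer to curve fitting than to machine learning in practice, but the workflow (fit, inspect the correlation, extrapolate) has the same shape as the data-driven modeling described here.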

