Abstract

Artificial intelligence is now extensively used to optimize and discover novel materials through data-driven search. The search space for the material to be discovered is usually so large that manual optimization is impractical. Data-driven search and optimization enable us to efficiently locate an optimal or acceptable material configuration with the desired target properties. One prominent data-driven optimization technique is Bayesian optimization (BO). At the core of BO is a machine learning (ML) model that learns about the problem from data acquired on the fly. In this way, BO becomes increasingly informed, directing the search more precisely by suggesting suitable material candidates for further evaluation. A candidate material is suggested by proposing parameters such as its composition and configuration, which are then evaluated either by physically synthesizing the material and testing its properties or through computational methods such as density functional theory (DFT). DFT enables researchers to exploit massively parallel architectures such as high-performance computing (HPC) systems, which a traditional BO might not fully leverage because of its typically sequential data-acquisition bottleneck. Here, we tackle this shortcoming of BO and maximize HPC utilization by enabling BO to suggest multiple candidate materials for DFT evaluation at once, which can then be distributed across multiple compute nodes of an HPC system. We achieve this through a batch optimization technique based on faux-data injection in the BO loop. In this approach, at each candidate suggestion from a typical BO loop, we "predict" the outcome instead of running the actual experiment or DFT calculation, forming a "faux" data point that is injected back to update the ML model. The next BO suggestion is therefore conditioned on the actual data as well as the faux data. The objective of this methodology is to simulate the time-consuming sequential data-gathering process and quickly approximate the next k potential candidates, all of which can then be run in parallel on an HPC system. Our objective in this work is to test whether the faux-data injection methodology accelerates our data-driven material discovery workflow. To this end, we perform computational experiments using organic–inorganic halide perovskites as a case study, since the optimality of the results can be readily verified against our previous work. To evaluate performance, we propose a metric that consolidates acceleration with the quality of the results, such as the best value reached during the process. We also use a different performance indicator for situations where the desired outcome is not a material with optimal properties but rather one whose properties satisfy some minimum requirements. We use these performance indicators to compare this BO-based faux-data injection method (FDI-BO) with different baselines. The results show that, under our design constraints, the FDI-BO approach yielded roughly two- to sixfold acceleration on average compared with sequential BO.
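
The faux-data-injection loop described above can be summarized in a minimal sketch. The sketch below is an illustrative assumption, not the authors' implementation: it assumes a discrete candidate pool, a Gaussian-process surrogate from scikit-learn, expected improvement as the acquisition function, and maximization of the target property; the function names (fdi_batch_suggest, expected_improvement) are hypothetical.

```python
# Minimal sketch of faux-data-injection batch suggestion (FDI-BO).
# Assumptions (not from the paper): discrete candidate pool, GP surrogate,
# expected improvement acquisition, maximization of the target property.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def expected_improvement(model, X_pool, y_best, xi=0.01):
    """Expected improvement of each pool point over the current best value."""
    mu, sigma = model.predict(X_pool, return_std=True)
    sigma = np.maximum(sigma, 1e-9)          # avoid division by zero
    z = (mu - y_best - xi) / sigma
    return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)


def fdi_batch_suggest(X_obs, y_obs, X_pool, k=4):
    """Return k candidates to evaluate in parallel (e.g., one DFT job per node).

    After each suggestion, the surrogate's own posterior mean is injected back
    as a "faux" observation, so later suggestions are conditioned on both the
    real data and the faux data.
    """
    X_aug, y_aug = np.array(X_obs, float), np.array(y_obs, float)
    pool = np.array(X_pool, float)
    batch = []
    for _ in range(k):
        model = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                         normalize_y=True)
        model.fit(X_aug, y_aug)
        ei = expected_improvement(model, pool, y_aug.max())
        idx = int(np.argmax(ei))
        x_next = pool[idx]
        y_faux = model.predict(x_next.reshape(1, -1))[0]  # "predict" instead of running DFT
        batch.append(x_next)
        X_aug = np.vstack([X_aug, x_next])                # inject faux data point
        y_aug = np.append(y_aug, y_faux)
        pool = np.delete(pool, idx, axis=0)               # suggest each candidate only once
    return batch
```

The k returned candidates would then be dispatched as independent DFT jobs, and only their actual (not faux) results would be appended to the observed data before the next batch is requested.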
