Abstract

Directly applying large language models to material and molecular design is not straightforward, particularly in real-world cases where accuracy under experimental validation is required. In this study, we propose a multimode descriptor design method for materials prediction and analysis that combines the strengths of a natural language processing model of the scientific literature and density functional theory (DFT) calculations, with the assistance of a genetic algorithm (GA). As a case study, we predict the aqueous photocurrents of the multisolvent-engineered halide perovskite CH3NH3PbI3 and carry out follow-up validation experiments. The multimode descriptors reach an unprecedented experimental validation accuracy of 87.5% via the GA for predicting aqueous photocurrents of perovskite materials (cf. only 50% experimental accuracy for other common machine learning models). We attribute this improvement to the successful deployment of a language model that incorporates concise scientific information from >1 million articles into molecular descriptors, in combination with DFT calculations. The subsequent machine learning analysis suggests the importance of cation···π interactions and crystallization in molecule-modified halide perovskite materials, which reflects an ontological and conceptual understanding. Importantly, the genetic process affords an accurate "white-box" model of perovskite stability (accuracy = 90.2% for the test data set and 92.3% for the training data set), expressed as a mathematical equation in F1 to F5, where F1 to F5 are atomic-level structural and chemical details such as cation···π interactions and highest occupied molecular orbital levels. This study offers a feasible descriptor design route to accurately predict complex material properties, leveraging both language models and density functional theory.
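As a rough illustration of how a genetic algorithm can search over a mixed pool of descriptors, the following minimal sketch evolves boolean masks that select columns from a concatenated "text-derived + DFT-derived" feature vector. It is not the authors' implementation: the data are synthetic, the function names (`make_toy_data`, `loo_accuracy`, `genetic_select`) are hypothetical, the fitness is the leave-one-out accuracy of a simple nearest-centroid classifier minus a small size penalty, and the sketch performs descriptor subset selection only rather than evolving a symbolic equation.

```python
import random


def make_toy_data(n_samples=40, seed=1):
    """Synthetic stand-in for multimode descriptors (assumption, not real data).

    Columns 0-2 play the role of text-derived descriptors, columns 3-5 the
    role of DFT-derived descriptors. Only columns 0 and 3 carry signal.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randrange(2)
        text_part = [2.0 * label + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)]
        dft_part = [-2.0 * label + rng.gauss(0, 0.3), rng.gauss(0, 1), rng.gauss(0, 1)]
        X.append(text_part + dft_part)
        y.append(label)
    return X, y


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty descriptor set cannot classify anything
    correct = 0
    for i in range(len(X)):
        sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
        counts = {0: 0, 1: 0}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            counts[y[k]] += 1
            for c, j in enumerate(cols):
                sums[y[k]][c] += X[k][j]
        best_label, best_dist = None, float("inf")
        for label in (0, 1):
            if counts[label] == 0:
                continue
            dist = sum(
                (X[i][j] - sums[label][c] / counts[label]) ** 2
                for c, j in enumerate(cols)
            )
            if dist < best_dist:
                best_label, best_dist = label, dist
        correct += best_label == y[i]
    return correct / len(X)


def genetic_select(X, y, n_generations=20, pop_size=20, mutation_rate=0.1, seed=0):
    """Evolve a boolean mask over descriptor columns with a basic elitist GA."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def fitness(mask):
        # Reward accuracy, lightly penalize the number of descriptors kept.
        return loo_accuracy(X, y, mask) - 0.01 * sum(mask)

    population = [
        [rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)
    ]
    for _ in range(n_generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # elitism: the top half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    X, y = make_toy_data()
    mask = genetic_select(X, y)
    print("selected columns:", [j for j, keep in enumerate(mask) if keep])
    print("leave-one-out accuracy:", loo_accuracy(X, y, mask))
```

The size penalty in the fitness function is one simple way to push the search toward a short, interpretable descriptor set, in the spirit of the "white-box" model described above; the penalty weight of 0.01 is an arbitrary choice for this sketch.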
