Abstract

High-quality internal experimental materials data are often “small data.” Expanding such “small data” into relatively “big data” with external datasets, e.g., data from the literature and from other groups, is particularly important for improving the prediction accuracy of machine learning models. The critical issue, however, is how to use a small amount of reliable internal data to filter external data extracted from multiple sources and obtain an optimal dataset whose distribution is similar to that of the internal data. This is called the multi-source materials data problem. We address it by designing an active learning-based data screening (ALDS) model that is suitable for small materials samples. Negative thermal expansion materials are used as the study subjects. The results show that ALDS can screen the external multi-source dataset and substantially outperforms traditional outlier-filtering methods: the average mean absolute percentage error (MAPE) in predicting the key target property, the negative thermal expansion coefficient (NTEC), was reduced from 4.301 to 0.056. Furthermore, reverse design experiments were conducted on anti-perovskite manganese nitride (AMN), a type of negative thermal expansion material, to show that the ALDS model can guide reverse design on a small AMN dataset. The MAPE values between the ALDS predictions and the ground truth for the samples’ three property indicators were 0.203, 0.126, and 0.115, indicating high confidence. This verifies that the proposed ALDS method improves the effectiveness of “small internal materials data” in guiding materials design and property prediction.
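For readers unfamiliar with the metric, the following is a minimal sketch of how MAPE is conventionally computed. The function name `mape` and the sample values are illustrative assumptions, not taken from the paper, and the abstract does not state whether its reported values are fractions or percentages; this sketch returns a fraction.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        # MAPE is undefined when a true value is zero (division by zero).
        raise ValueError("true values must be non-zero")
    # Average of |true - predicted| / |true| over all samples.
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up expansion coefficients (arbitrary units), used only to show the call.
measured = [-10.0, -20.0]
predicted = [-9.0, -22.0]
error = mape(measured, predicted)
```

Because each error is divided by the true value, MAPE is scale-independent, but it becomes unstable when true values are close to zero.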
