Identifying crystal structures is a crucial step in the search for novel functional materials. The procedure is usually time-consuming and prone to false-positive and false-negative assignments. It demands considerable expertise in crystallography and, in particular, deep experience with the perovskite-related structures of hybrid perovskites. This work is devoted to machine learning (ML) classification of the structure types of hybrid lead halides from available X-ray diffraction (XRD) data. We propose a simple approach for quickly identifying the dimensionality of the inorganic substructure, the type of connection of the lead halide polyhedra, and the structure type from common powder XRD data using a decision tree classification model. For theoretically calculated XRD patterns of the 14 most common structure types, the average accuracy in predicting the dimensionality of the inorganic substructure, the type of connection of the lead halide polyhedra, and the inorganic substructure topology reached 0.76 ± 0.07, 0.827 ± 0.028 and 0.71 ± 0.05, respectively. To test the transferability of the model, we expanded the dataset to 30 structure types, for which the corresponding average accuracies reached 0.820 ± 0.022, 0.74 ± 0.05 and 0.633 ± 0.018, respectively. Validation of the decision tree model on experimental XRD data gave accuracies of 1.0 and 0.82 for dimensionality and structure type prediction, respectively. Thus, our approach can significantly simplify and accelerate the interpretation of highly complicated XRD data for hybrid lead halides.
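The abstract does not state how the "average accuracy ± spread" figures were obtained. As a minimal sketch, assuming they are the mean and sample standard deviation of per-split classification accuracies, the summary could be computed as follows. The function names and the label lists are hypothetical and made up purely for illustration; they are not the authors' code or data.

```python
from statistics import mean, stdev


def accuracy(true_labels, predicted_labels):
    """Fraction of patterns whose predicted class matches the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def summarize(fold_accuracies):
    """Return (mean, sample standard deviation) over the splits."""
    return mean(fold_accuracies), stdev(fold_accuracies)


# Hypothetical dimensionality labels for three train/test splits:
# each tuple is (true labels, predicted labels). Illustrative only.
folds = [
    (["3D", "2D", "2D", "1D"], ["3D", "2D", "1D", "1D"]),
    (["0D", "2D", "3D", "1D"], ["0D", "2D", "3D", "1D"]),
    (["2D", "2D", "1D", "0D"], ["2D", "1D", "1D", "1D"]),
]

fold_accuracies = [accuracy(t, p) for t, p in folds]
avg, spread = summarize(fold_accuracies)
print(f"{avg:.2f} ± {spread:.2f}")
```

The same summary would apply unchanged to the connection-type and topology labels; only the label lists differ.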