Geometric information such as the space group and crystal system plays an important role in the properties of crystalline materials. Prediction of crystal system and space group therefore has wide applications in material property estimation and structure prediction. Structure determination methods based on experimental X-ray diffraction (XRD) and density functional theory (DFT) achieve outstanding performance, but they are not suited to large-scale screening of material compositions. Machine learning models that use Magpie descriptors for composition-based space group determination also exist, but their prediction accuracy ranges only between 0.638 and 0.907 across different kinds of crystals. Herein, we report an improved machine learning model for predicting the crystal system and space group of inorganic materials using only the formula information. A benchmark study on a dataset downloaded from the Materials Project database shows that our random forest models, based on our new descriptor set, achieve significant performance improvements over previous work, with accuracy scores ranging between 0.712 and 0.961 for space group classification. Our model also shows a large performance improvement for crystal system prediction. Trained models and source code are freely available at https://github.com/Yuxinya/SG_predict
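To illustrate what "using only the formula information" can mean in practice, the following minimal sketch converts a flat chemical formula into atomic fractions, a common first step before computing composition descriptors. It is an assumption made for illustration and not the descriptor set reported here; the function name `composition_fractions` is hypothetical, and parentheses, hydrates, and charges are not handled.

```python
import re
from collections import Counter

# An element symbol (one uppercase letter, optional lowercase letter)
# followed by an optional integer or decimal count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def composition_fractions(formula):
    """Return {element: atomic fraction} for a flat formula such as 'Fe2O3'.

    Illustrative helper only: nested groups like 'Ca(OH)2' are not supported.
    """
    counts = Counter()
    for symbol, amount in _TOKEN.findall(formula):
        # A missing count means one atom of that element.
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no element symbols found in %r" % formula)
    return {element: n / total for element, n in counts.items()}


if __name__ == "__main__":
    print(composition_fractions("Fe2O3"))
```

A fixed-length vector of such fractions, ordered by element, could then serve as input to a classifier such as a random forest; the descriptors actually used would need to be taken from the linked repository.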