The bandgap is an inherent property of semiconductors and insulators, significantly influencing their electrical and optical characteristics. However, theoretical calculations using the density functional theory (DFT) are time-consuming and underestimate bandgaps. Machine learning offers a promising approach for predicting bandgaps with high precision and high throughput, but its models face the difficulty of being hard to interpret. Hence, an application of explainable artificial intelligence techniques to the bandgap prediction models is necessary to enhance the model's explainability. In our study, we analyzed the support vector regression, gradient boosting regression, and random forest regression models for reproducing the experimental and DFT bandgaps using the permutation feature importance (PFI), the partial dependence plot (PDP), the individual conditional expectation plot, and the accumulated local effects plot. Through PFI, we identified that the average number of electrons forming covalent bonds and the average mass density of the elements within compounds are particularly important features for bandgap prediction models. Furthermore, PDP visualized the dependency relationship between the characteristics of the constituent elements of compounds and the bandgap. Particularly, we revealed that there is a dependency where the bandgap decreases as the average mass density of the elements of compounds increases. This result was then theoretically interpreted based on the atomic structure. These findings provide crucial guidance for selecting promising descriptors in developing high-precision and explainable bandgap prediction models. Furthermore, this research demonstrates the utility of explainable artificial intelligence methods in the efficient exploration of potential inorganic semiconductor materials.
Read full abstract