In this paper, we present a machine learning approach that combines Long Short-Term Memory (LSTM) networks with a sliding window technique for feature extraction to predict point defect percentages in semiconductor materials from simulated X-ray diffraction (XRD) data. The model was trained on simulated silicon XRD data with defect percentages ranging from 1 to 5%, enabling it to predict defect percentages from 0 to 10% in silicon and in other semiconductor materials, including AlAs, CdS, GaAs, Ge, and ZnS. We explored different sequence lengths and numbers of LSTM units and identified the best configuration as a sequence length of 3501 with 4500 units. With 4500 units, the model’s mean absolute error was 0.021, the lowest among the LSTM configurations tested. The sliding window technique plays a crucial role in capturing temporal dependencies within the XRD data, allowing the model to generalize to other semiconductor materials. We also observed that increasing defect percentages consistently raised the background intensity. We further examined the relationship between crystal structure and defect percentage predictions and found consistent trends for materials with diamond cubic and zinc blende structures. This LSTM-based method offers a novel approach to predicting defect percentages from simulated XRD patterns.
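As a rough illustration of the sliding window idea mentioned above, the following minimal sketch cuts a one-dimensional intensity profile into overlapping windows and arranges them in the (batch, timesteps, features) layout that LSTM layers commonly expect. The abstract does not specify the window size, the step, or how windows are fed to the network, so the function name `sliding_windows`, its parameters, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def sliding_windows(intensities, window, step=1):
    """Cut a 1D intensity profile into overlapping windows (hypothetical helper).

    Returns a 2D array of shape (n_windows, window).
    """
    x = np.asarray(intensities, dtype=float)
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if window > x.size:
        raise ValueError("window is longer than the pattern")
    # Start index of each window; the last window ends inside the pattern.
    starts = range(0, x.size - window + 1, step)
    return np.stack([x[s:s + window] for s in starts])


# Toy stand-in for a simulated XRD pattern (assumed data, not from the paper).
pattern = np.arange(10, dtype=float)

# Windows of 4 points, advancing 2 points at a time.
windows = sliding_windows(pattern, window=4, step=2)

# Add a leading batch axis: (batch, timesteps, features).
lstm_input = windows[np.newaxis, :, :]
```

Each row of `windows` acts as one step of the sequence, so neighbouring steps share part of the pattern; a larger `step` reduces that overlap and shortens the sequence.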