This paper introduces a novel hybrid machine learning model that combines Long Short-Term Memory (LSTM) networks with SHapley Additive exPlanations (SHAP) to improve bug localization across multiple software platforms. The aim is to adapt to the variability across operating systems while giving software developers transparent, interpretable results. Our methodology preprocesses bug report data with natural language processing techniques, then extracts features through word embeddings to accommodate the sequential nature of text data. The LSTM model is trained and evaluated on a dataset of simulated bug reports, and its predictions are interpreted using SHAP values so that the basis for each decision is clear. The results indicate that the model is robust and adaptable, with consistent performance across platforms as measured by accuracy, precision, recall, and F1 score. The distribution of bug categories and statuses in the dataset also offers insight into common software development issues.
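As an illustration of the cross-platform evaluation summarized above, the following is a minimal sketch of how accuracy, precision, recall, and F1 might be computed separately for each platform. It is not the authors' implementation: the function names, the `(platform, true_label, predicted_label)` record layout, and the use of macro-averaging over bug categories are all assumptions made for this example, since the abstract does not specify them.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging (an assumption of this sketch) weights every
    bug category equally, regardless of how often it occurs.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1  # predicted this class, but it was wrong
            fn[true_label] += 1  # missed the actual class

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def metrics_by_platform(records):
    """Group hypothetical (platform, true_label, predicted_label) records
    by platform and score each group independently."""
    grouped = {}
    for platform, true_label, pred_label in records:
        trues, preds = grouped.setdefault(platform, ([], []))
        trues.append(true_label)
        preds.append(pred_label)
    return {
        platform: classification_metrics(trues, preds)
        for platform, (trues, preds) in grouped.items()
    }
```

Comparing the resulting per-platform dictionaries side by side is one simple way to check whether performance is consistent across platforms; a large gap in any metric between platforms would argue against that claim.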