This study addresses loan default prediction, a task central to managing financial risk and maintaining the stability of financial institutions. Conventional statistical methods often struggle to capture the nonlinear and sequential dynamics of financial data, which motivates the examination of more sophisticated machine learning methods. This research reports an experimental comparison of three machine learning (ML) and deep learning (DL) models, namely Long Short-Term Memory (LSTM) networks, Random Forest (RF), and Support Vector Regression (SVR), to assess how well each forecasts loan defaults. The models are evaluated using metrics such as Mean Squared Error (MSE), F1 score, and accuracy, with attention to how well they handle imbalanced datasets and capture complex relationships in the data. The results indicate that Random Forest outperforms the other models in accuracy and MSE, while the LSTM shows considerable potential on imbalanced data, as evidenced by its stable F1 score. SVR achieves competitive precision but handles class imbalance poorly. ANOVA confirms that the differences in model performance are statistically significant. The LSTM and SVR models remain under development; ongoing work aims to refine them through hyperparameter optimization and more advanced architectures to improve their predictive performance in practical applications.