Abstract. As the global economy develops, demand for loans from individuals and enterprises continues to grow, and loan defaults have become a major challenge for the financial industry. Defaults not only directly reduce the profitability of financial institutions but may also trigger systemic risk and threaten the wider economic system. Improving the accuracy of loan default prediction is therefore crucial for financial institutions to manage credit risk effectively. This study builds a personal loan default prediction system that combines TabNet with a logistic regression model. Through feature engineering, it extracts potential credit risk features from personal loan and Internet loan datasets. Combining TabNet's feature learning capability with the interpretability of logistic regression improves the accuracy of default prediction: the integrated model achieves an AUC of 0.89. The results indicate that a machine learning based default prediction system can significantly improve the quality of credit approval decisions and lower the likelihood of bad debts for financial institutions. Future studies could optimize the feature selection process, experiment with more advanced machine learning algorithms, or apply the model to more diverse loan datasets to improve both its generalization and its accuracy. In conclusion, this study offers a novel approach to loan default prediction with significant practical value, supporting the risk management and decision-making processes of financial institutions.