Global economic development has made society's needs increasingly complex, and financial institutions provide facilities to meet those needs. Problem loans, however, pose a serious threat to these institutions, so classification techniques from data mining are used to address them. This research develops a model that predicts customers' ability to make credit payments so that financial institutions can avoid problematic credit. The SMOTE resampling technique is used to examine the effect of sampling on handling class imbalance and on credit assessment. The results show that models built with SMOTE achieve a higher AUC than models built without it. Of the two machine learning algorithms compared, logistic regression and random forest, the random forest model with SMOTE performs best, with an accuracy of 90%, precision of 92%, recall of 88%, F1-score of 90%, and an AUC of 0.97. From the best model, ten important features that influence the assessment of credit repayment capability were obtained, including the normalized score from external data sources, the period for changing customer numbers, the number of previous installment payments, the customer's age, registration time, the period for applying for credit at the credit bureau, the period for changing identity documents, the time for updating information at the credit bureau, and the length of time the customer has worked. In addition, this research produces dashboard visualizations that can be used to improve the process of assessing credit repayment capabilities.
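The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the core idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote_oversample`, the choice of `k`, and the toy two-feature data are illustrative assumptions and are not taken from the study.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: 8 customers who repaid (class 0), 3 who defaulted (class 1).
majority = [[0.1 * i, 1.0 + 0.05 * i] for i in range(8)]
minority = [[2.0, 0.2], [2.2, 0.1], [2.4, 0.3]]

# Generate just enough synthetic defaults to balance the two classes.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority), k=2)
X = majority + minority + new_points
y = [0] * len(majority) + [1] * (len(minority) + len(new_points))
print(y.count(0), y.count(1))  # prints "8 8"
```

In a workflow like the one summarized above, resampling of this kind would normally be applied to the training split only, so that accuracy, precision, recall, F1-score, and AUC are measured on data containing no synthetic samples. The balanced training set would then be passed to the classifier, for example logistic regression or random forest.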