Loan prediction plays an important role in how financial institutions evaluate loan applications. Machine learning models can automate this evaluation and make lending faster and more efficient. The main objective of this research is to develop loan approval prediction models using the Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Random Forest algorithms and to compare their performance. A second objective is to determine the effect of the K-Best and Recursive Feature Elimination feature selection methods on model performance. A third is to evaluate how effectively K-Fold cross-validation and the Train, Test and Validation technique measure model performance. The findings revealed that married individuals are more likely to be approved for loans than single individuals, high-income individuals more likely than low-income individuals, males more likely than females, and university graduates more likely than those without a university degree. According to the performance measures, Random Forest was the most successful algorithm, predicting loan approval with an accuracy of 97.71%. This accuracy was obtained with features selected by Recursive Feature Elimination and performance measured by cross-validation. The feature selection methods had a significant impact on model performance, and Recursive Feature Elimination was the most successful of them. Random Forest showed the highest performance in all cases.
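The evaluation setup described in the abstract (K-Best or Recursive Feature Elimination, followed by a Random Forest scored with K-Fold cross-validation) could be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the study's actual code: the data is synthetic, and the number of selected features, the forest size, the number of folds, and the `f_classif` scoring function are all assumed values chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a loan dataset (assumption: the study's data is not used here).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, n_redundant=2, random_state=0
)


def make_forest():
    # Small forest to keep the sketch fast; hyperparameters are illustrative only.
    return RandomForestClassifier(n_estimators=50, random_state=0)


# Feature selection sits inside each pipeline so that it is refit on every
# training fold and never sees the held-out fold.
pipelines = {
    "k_best": Pipeline([
        ("select", SelectKBest(score_func=f_classif, k=5)),
        ("model", make_forest()),
    ]),
    "rfe": Pipeline([
        ("select", RFE(estimator=make_forest(), n_features_to_select=5)),
        ("model", make_forest()),
    ]),
}

# Stratified 5-fold cross-validation (assumed fold count).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

mean_accuracy = {}
for name, pipe in pipelines.items():
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy = {scores.mean():.3f} (std {scores.std():.3f})")
```

Placing the selector inside the pipeline is a design choice of this sketch: selecting features on the full dataset before cross-validation would let information from the test folds influence the selection and could inflate the reported accuracy.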