Abstract

A loan application at a bank goes through a series of assessments, based on applicant data and credit scores, that determine the borrower's eligibility to receive the loan. Machine learning can serve as the basis for evaluating whether an individual is creditworthy, reducing the potential risks faced by banks. This research aims to achieve the best accuracy on the Loan Approval Prediction dataset, which is sourced from the open dataset provider Kaggle. The dataset has 14 features and 4,269 records. Analysis of the dataset showed that 4,173 of the 4,269 records were usable for study (2,599 approved and 1,574 rejected). The feature importance evaluation across the 14 features shows that loan amount is the most important feature, while bank asset value has the lowest influence. Several Decision Tree ensemble models were also tested on the dataset: Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Gradient Boosting, Random Forest, Adaptive Boosting (AdaBoost), and Extra Trees. The comparison shows that XGBoost is the best model, with Accuracy 0.9974, AUC 0.9998, Recall 0.9963, Precision 0.9969, and F1 0.9966.
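To make the reported metrics concrete, the following minimal sketch shows how accuracy, precision, recall, and F1 are derived from a confusion matrix. It is an illustration only, not the authors' pipeline: the function names `classification_metrics` and `f1_from` and the toy labels are hypothetical, and "approved" is assumed to be the positive class. Only the class counts and the reported precision and recall are taken from the abstract.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Toy labels (hypothetical, NOT from the dataset): 1 = approved, 0 = rejected.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
toy = classification_metrics(y_true, y_pred)

# Class counts stated in the abstract: 2,599 approved and 1,574 rejected.
approved, rejected = 2599, 1574
# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(approved, rejected) / (approved + rejected)

# F1 implied by the precision and recall reported for XGBoost.
implied_f1 = f1_from(0.9969, 0.9963)

if __name__ == "__main__":
    print(toy)
    print(f"majority-class baseline accuracy: {majority_baseline:.4f}")
    print(f"F1 implied by reported precision/recall: {implied_f1:.4f}")
```

The majority-class baseline gives a reference point for the reported accuracy, and the implied F1 offers a quick consistency check against the reported F1, assuming all three scores were computed on the same predictions.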

