Implementation of LightGBM and Random Forest in Potential Customer Classification

Laura Sari, Rostika Lityaningrum, Hety Dwi Hastuti, Annisa Romadloni

doi:10.38043/tiers.v4i1.4355

Abstract

Classification is one of the data mining techniques that can be used to identify potential customers. Previous research shows that the boosting method, especially LightGBM (LGBM), produced the highest accuracy of all models, namely 100%. Of the two bagging methods, Random Forest achieved higher accuracy than Extra Trees, namely 99.03%. This research uses the LightGBM and Random Forest methods to classify potential customers. The results indicate that on imbalanced data LightGBM has better accuracy than Random Forest: LightGBM reaches 85.49%, whereas Random Forest is unable to produce a model. The SMOTE method used in this study affects the accuracy of Random Forest but does not affect the accuracy of LightGBM. Across the accuracy, recall, specificity, and precision values, Random Forest produces better values than LightGBM on balanced data. Meanwhile, LightGBM is able to handle imbalanced data.
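
For readers unfamiliar with the four evaluation measures named in the abstract, the following is a minimal sketch of how accuracy, recall, specificity, and precision are commonly derived from a binary confusion matrix. The function name `classification_metrics` and the counts in the usage line are hypothetical illustrations; they are not taken from the study's data or code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute four standard binary-classification measures.

    tp, fp, tn, fn are the confusion-matrix counts: true positives,
    false positives, true negatives, and false negatives.
    """
    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is zero,
        # e.g. when a model never predicts the positive class.
        return numerator / denominator if denominator else float("nan")

    total = tp + fp + tn + fn
    return {
        "accuracy": ratio(tp + tn, total),      # share of all cases classified correctly
        "recall": ratio(tp, tp + fn),           # share of actual positives that were found
        "specificity": ratio(tn, tn + fp),      # share of actual negatives that were found
        "precision": ratio(tp, tp + fp),        # share of predicted positives that are correct
    }


# Hypothetical counts, for illustration only (not results from the paper).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting recall and specificity alongside accuracy matters on imbalanced data, because a classifier that predicts only the majority class can still score a high accuracy.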
