Gradient Boosting Machine, Random Forest, and Light GBM for Dry Bean Classification (Gradient Boosting Machine, Random Forest dan Light GBM untuk Klasifikasi Kacang Kering)

Indrawata Wardhana, Vandri Ahmad Isnaini, Rahmi Putri Wirman, Musi Ariawijaya

doi:10.29207/resti.v6i1.3682

Open Access

https://doi.org/10.29207/resti.v6i1.3682

Abstract

Bean seed classification is critical in determining the quality of beans. In earlier work, the same dataset was tested with the MLP, SVM, KNN, and DT algorithms, with SVM producing the best results. The purpose of this study is to determine the most effective model by using the BoxCox transformation for feature selection together with the random forest (RF), gradient boosting machine (GBM), and light GBM algorithms, evaluated with repeated k-folds. The bean dataset is available on the UCI Repository website. The BoxCox transformation and repeated k-folds improved classification accuracy. The optimal training configurations were a random forest with 50 decision trees and a depth of 10, a gradient boosting machine with a learning rate of 1, and a light gradient boosting machine with a learning rate of 0.5 and 500 estimators. Light GBM gave the best training accuracy, 99 percent, but reached only 91 percent on validation. According to the results, the Barbunya, Bombay, Cali, Dermason, Horoz, Seker, and Sira bean classes reached accuracy values of 91 percent, 100 percent, 92 percent, 92 percent, 95 percent, 94 percent, and 84 percent, respectively.
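The evaluation setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a synthetic seven-class dataset as a stand-in for the UCI bean data, and picks 5 folds with 2 repeats arbitrarily because the abstract does not state those values. Only the hyperparameters named in the abstract (50 trees, depth 10, learning rate 1) come from the source.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-in for the seven-class bean data (assumption: the real
# dataset would be loaded from the UCI Repository instead).
X, y = make_classification(
    n_samples=700, n_features=16, n_informative=8, n_redundant=2,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
# Box-Cox is only defined for strictly positive inputs, so map the
# synthetic features onto the positive axis.
X = np.exp(X / 2.0)

# Hyperparameters as reported in the abstract; everything else is a default.
models = {
    "random_forest": RandomForestClassifier(
        n_estimators=50, max_depth=10, random_state=0
    ),
    "gbm": GradientBoostingClassifier(learning_rate=1.0, random_state=0),
}

# Repeated k-folds: 5 folds x 2 repeats (assumed values, not from the source).
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

scores = {}
for name, model in models.items():
    # Fitting the Box-Cox transform inside the pipeline keeps it from
    # seeing the held-out fold during training.
    pipeline = make_pipeline(PowerTransformer(method="box-cox"), model)
    scores[name] = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
    print(f"{name}: {scores[name].mean():.3f} +/- {scores[name].std():.3f}")
```

If the `lightgbm` package is installed, an `LGBMClassifier(learning_rate=0.5, n_estimators=500)` entry could be added to `models` in the same way, since it follows the scikit-learn estimator interface. Accuracies on the synthetic data say nothing about the figures reported for the bean dataset.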

Highlights

Bean seed classification is critical in determining the quality of beans
The bean dataset is available on the UCI Repository website
The optimal random forest configuration used 50 decision trees and a depth of 10

Summary

Introduction

Grain classification is a very important factor in determining grain quality, and experts have carried it out with a wide range of methods. With machine learning and image recognition, dry beans can be identified by their length, shape, size, and other physical characteristics. One study found that GBM improved prediction accuracy, measured by R-squared and RMSE, by more than 80 percent compared with the industry's best models, namely the random forest algorithm and linear regression [21]. In miRNA prediction for breast cancer patients using several machine learning techniques (XGBoost, Random Forest, and LightGBM), LightGBM outperformed the other two in several respects, including accuracy and speed [26]. This study compares the prediction accuracy of three algorithms, gradient boosting machine, random forest, and Light GBM, using BoxCox feature selection. The comparison is tested on the classification of the dry bean dataset.
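For reference, the Box-Cox transformation is commonly defined as shown here. The source does not print the formula, so the notation (input value x, power parameter λ) is an assumption chosen for illustration; the definition applies only to strictly positive x.

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\ln x, & \lambda = 0,
\end{cases}
\qquad x > 0
```

The parameter λ is typically estimated per feature so that the transformed values are as close to normally distributed as possible.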

Bean Dataset

BoxCox Data Normalization

Evaluation

Classification Algorithms

Equipment

Variable Correlation

Model Training

BoxCox Repeated k-folds

Random Forest: One of the parameters for optimizing the Random

Light GBM

Findings

Conclusion

Journal: Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Publication Date: Feb 27, 2022
License type: CC BY 4.0
