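Penerapan Algoritma C4.5 Pada Imbalanced Dataset Untuk Memprediksi Kegagalan Angsuran Properti (Application of the C4.5 Algorithm to an Imbalanced Dataset to Predict Property Installment Failure)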

Yodi Susanto, Muhammad Syafrullah, Devit Setiono

doi:10.36054/jict-ikmi.v20i2.372

Abstract

This research studied the patterns of consumers who fail to pay, with the aim of building a model for predicting customers who have the potential to default. The research followed the Cross-Industry Standard Process for Data Mining (CRISP-DM), covering business understanding, data understanding, data preparation, modeling, evaluation, and deployment/interpretation. The dataset was drawn from sales, cancellation, and consumer data from January 2016 to December 2019. Because the dataset was imbalanced, the researchers applied the Synthetic Minority Oversampling Technique (SMOTE) to handle the imbalance. The research compared accuracy, precision, recall, F-measure, and Area Under the ROC Curve (AUC) between the original dataset and the SMOTE-augmented dataset across several algorithms: C4.5, K-NN, and Naïve Bayes. The attributes used were source of funds, purpose of purchase, age, selling price, occupation, total installments, percentage of total installments, monthly installments, percentage of late installments, and status. The comparison found that the C4.5 algorithm with the SMOTE 480% dataset achieved the highest accuracy, 97.62%, with precision of 0.976, recall of 0.976, F-measure of 0.976, and AUC of 0.986, which falls in the Excellent Classification category. The model built on the imbalanced dataset with the C4.5 algorithm and SMOTE is therefore expected to be usable for predicting consumer installment failures.
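To make the oversampling step concrete, here is a minimal sketch of SMOTE-style interpolation using only the Python standard library. It is not the authors' implementation. It assumes that "SMOTE 480%" means the number of synthetic samples equals 480% of the original minority class size, and it handles numeric features only, whereas the study's attributes also include nominal ones such as occupation. The function name `smote`, the parameter names, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, percentage=480, k=5, seed=0):
    """Return synthetic minority samples created by SMOTE-style interpolation.

    `percentage` is read as the number of synthetic samples relative to the
    minority class size (an assumption about what "SMOTE 480%" denotes).
    """
    rng = random.Random(seed)
    n_new = round(len(minority) * percentage / 100)
    synthetic = []
    for i in range(n_new):
        # Cycle through the minority samples so each is used about equally.
        base_index = i % len(minority)
        base = minority[base_index]
        # Find the k nearest minority neighbours of the base sample.
        others = [p for j, p in enumerate(minority) if j != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (hypothetical values).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2], [1.8, 2.1]]
synthetic = smote(minority, percentage=480, k=3)
print(len(minority), "original ->", len(synthetic), "synthetic samples")
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing rows.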

Highlights

The C4.5 algorithm with the Synthetic Minority Oversampling Technique (SMOTE) 480% dataset achieved the highest accuracy, 97.62%, with precision of 0.976, recall of 0.976, F-measure of 0.976, and Area Under the ROC Curve (AUC) of 0.986, which falls in the Excellent Classification category.
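The reported precision, recall, and F-measure can be related to a confusion matrix with a short sketch. The code below computes per-class values for a two-class problem and then averages them weighted by class size. Whether the paper's figures are class-weighted averages is an assumption, although it would be consistent with recall matching accuracy, since class-weighted recall always equals accuracy. The function names and the confusion-matrix counts are hypothetical, not taken from the study.

```python
def per_class_metrics(tp, fp, fn):
    """Precision, recall and F-measure for one class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


def weighted_report(tp, fp, fn, tn):
    """Accuracy plus class-weighted precision, recall and F-measure."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Positive class (e.g. failed installments) and negative class,
    # with the roles of the counts swapped for the negative class.
    pos = per_class_metrics(tp, fp, fn)
    neg = per_class_metrics(tn, fn, fp)
    # Weight each class by its share of the actual instances.
    w_pos = (tp + fn) / total
    w_neg = (tn + fp) / total
    weighted = tuple(w_pos * p + w_neg * n for p, n in zip(pos, neg))
    return accuracy, weighted


# Hypothetical confusion-matrix counts for illustration only.
accuracy, (precision, recall, f_measure) = weighted_report(tp=80, fp=5, fn=20, tn=895)
print(f"accuracy={accuracy:.4f} precision={precision:.3f} "
      f"recall={recall:.3f} f_measure={f_measure:.3f}")
```

On an imbalanced dataset the per-class values for the minority class are usually the more informative ones, because weighted averages are dominated by the majority class.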

Summary

Data Collection and Analysis

In addition to observation and documentation, the researchers collected supporting data by interviewing experts. The experts in this case were the management of the property project, namely the marketing and collection divisions, which are directly involved in the sales process. The experts suggested several attributes that, in their view, could be factors in payment default.

Data Collection

Data Preparation

Algorithm Comparison Test

Feature Selection

Modelling

Predictor attributes, 1 class label

Model After Applying SMOTE

Findings

3.11 Interpretation

Journal: Jurnal ICT : Information Communication & Technology
Publication Date: Dec 30, 2021
License type: CC BY 4.0

Similar Papers

Scholarship Recipients Prediction Model using k-Nearest Neighbor Algorithm and Synthetic Minority Over-sampling Technique
Dede Kurniadi ... Yoga Handoko Agustin
03 Oct 2022

Analyzing and Processing of Supplier Database Based on the Cross-Industry Standard Process for Data Mining (CRISP-DM) Algorithm
Mohsen Jafari Nodeh ... M Hanefi Calp
01 Jan 2020

Sentiment Classification of Over-Tourism Issues in Responsible Tourism Content using Naïve Bayes Classifier
Yerik Afrianto Singgalen
Journal of Computer System and Informatics (JoSYC) | VOL. 5
20 Feb 2024

Predicting the Timeliness of Student Graduation Using Decision Tree C4.5 Algorithm in Universitas Advent Indonesia
Yusran Timur Samuel ... Bern Jonathan
01 Jul 2019
