Abstract

Diabetes is a disease caused by blood sugar levels that are high or beyond normal limits. The number of people with diabetes in Indonesia has risen significantly: Basic Health Research reports that prevalence increased from 6.9% in 2013 to 8.5% in 2018, with an estimated more than 16 million sufferers. A technology that can detect diabetes accurately and with good performance is therefore needed, so that diabetes can be treated early to reduce the number of sufferers, disabilities, and deaths. The different scale values for each attribute in Gula Karya Medika’s data can complicate the classification process. For this reason, the researcher compares two data normalization methods, min-max normalization and z-score normalization, against a baseline without data normalization, using Random Forest (RF) as the classification method. RF has been tested in several previous studies and is able to produce good performance with high accuracy. Based on the research results, the best accuracy is achieved by model 1 (min-max normalization with RF) at 95.45%, followed by model 2 (z-score normalization with RF) at 95% and model 3 (RF without data normalization) at 92%. From these results, it can be concluded that model 1 outperforms the other two models and raises Random Forest classification accuracy to 95.45%.
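As a rough illustration of the two normalization methods named in the abstract, the sketch below rescales a single attribute column with min-max normalization, x' = (x - min) / (max - min), and with z-score normalization, z = (x - mean) / std. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the use of the population standard deviation, and the sample `glucose` and `age` values are all hypothetical and do not come from the Gula Karya Medika data.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical attribute columns on different scales (illustrative values only).
glucose = [90.0, 120.0, 150.0, 180.0]
age = [25.0, 35.0, 45.0, 55.0]

glucose_mm = min_max_normalize(glucose)  # all values now lie in [0, 1]
age_z = z_score_normalize(age)           # values now centred on 0
```

After either transformation the attributes share a comparable scale, and the normalized columns can be passed to the classifier in place of the raw values.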


Summary

Introduction

Diabetes is a disease caused by blood sugar levels in the body that are high or exceed normal limits. Research on diabetes has used a variety of classification methods to detect the disease. In one comparison, min-max normalization scored higher than z-score normalization, 88.09% to 78.56%, showing that min-max normalization can improve accuracy. In 2020, Diniyal Amru Agatsa [3] built a classification model for diabetes patients using the Support Vector Machine method on diabetes data, validated the model with K-Fold Cross Validation, which divides the data into k parts, and obtained an accuracy of 77.92%. Indrayanti, in 2017 [4], used KNN as the classification method for diabetes mellitus and obtained an accuracy of 75.14%, with k=13 as the optimal value of k. In the classification stage, the accuracy obtained by Random Forest with min-max normalization, z-score normalization, and no data normalization is compared to determine which data normalization method is more optimal and accurate in improving diabetes classification performance.
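The K-Fold Cross Validation mentioned for the Support Vector Machine study divides the data into k parts and uses each part once as the test set. The sketch below shows one way such a split could be generated. It is only an illustration: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled, and nothing here is taken from the cited study's code.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds.

    Fold sizes differ by at most one when n_samples is not divisible by k.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
```

Each sample index appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate that uses every record for testing once.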

Research Method
Data Cleaning
Diabetes Dataset
Conclusion