Analysis of Expertise Group Using The Fuzzy K-NN Classification Algorithm (Case Study: School of Computing Telkom University)

Jodi Kusuma, Ichwanul Muslim Karo Karo, Angelina Prima Kurniati

doi:10.30865/jurikom.v9i3.4215

Open Access

Abstract

The School of Computing at Telkom University has four Expertise Groups that define the courses students take. The choice of Expertise Group influences which elective courses a student takes and the topic of the Final Project. Many students still have difficulty choosing an Expertise Group and end up choosing the most popular one without considering their own potential and abilities. A poor choice of Expertise Group can delay graduation, which in turn affects study program accreditation and university ranking, especially on the timely graduation indicator. Therefore, a system is needed that can predict a suitable Expertise Group for School of Computing students based on their academic scores. In this study, the Fuzzy K-Nearest Neighbor classification algorithm was chosen for prediction because it determines the class from the nearest neighbors and can account for ambiguous data through the membership weight it assigns to each class. Five tests were carried out to find the best model: (1) examining the best split between training and validation data, (2) examining the best value of K, (3) comparing Fuzzy K-Nearest Neighbor with Naïve Bayes and Decision Tree (C4.5), which are commonly used classification algorithms, (4) examining accuracy, precision, recall, and f1-score, and (5) examining accuracy using the Cross-Validation method. The resulting Fuzzy K-Nearest Neighbor model has an accuracy of 72% on the imbalanced data, 62% when undersampling is applied, and 56% when oversampling is applied. Compared with the other two algorithms, Fuzzy K-Nearest Neighbor has higher accuracy on the imbalanced data and with undersampling, but lower accuracy with oversampling, which is attributed to the weakness of Fuzzy K-Nearest Neighbor in handling the small variation in the minority-class data.
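
The class weighting mentioned in the abstract can be illustrated with a minimal sketch of one common Fuzzy K-Nearest Neighbor formulation, in which each of the K nearest neighbors votes for its class with a weight that decreases with distance. This is not the authors' implementation: the function names, the fuzzifier `m = 2`, the crisp training labels, the Euclidean distance, and the toy scores and group names below are all assumptions made for illustration.

```python
import math


def fuzzy_knn_memberships(train_x, train_y, query, k=3, m=2.0, eps=1e-9):
    """Return a dict mapping each class label to its fuzzy membership for `query`.

    Assumptions (not from the paper): Euclidean distance, crisp training
    labels (each training sample fully belongs to its own class), and
    inverse-distance weights with fuzzifier m > 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")

    # Keep the k training samples closest to the query.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]

    # Closer neighbors get larger weights; eps guards against zero distance.
    exponent = 2.0 / (m - 1.0)
    weights = [(1.0 / (d ** exponent + eps), label) for d, label in neighbors]
    total = sum(w for w, _ in weights)

    # Membership of a class = normalized sum of the weights of its neighbors.
    memberships = {label: 0.0 for label in set(train_y)}
    for w, label in weights:
        memberships[label] += w / total
    return memberships


def fuzzy_knn_predict(train_x, train_y, query, k=3, m=2.0):
    """Return the class label with the highest fuzzy membership."""
    memberships = fuzzy_knn_memberships(train_x, train_y, query, k=k, m=m)
    return max(memberships, key=memberships.get)


# Hypothetical toy data: two course scores per student, two made-up groups.
train_x = [(80, 60), (85, 65), (60, 85), (55, 90)]
train_y = ["group_1", "group_1", "group_2", "group_2"]
query = (78, 62)

memberships = fuzzy_knn_memberships(train_x, train_y, query, k=3)
predicted = fuzzy_knn_predict(train_x, train_y, query, k=3)
print(memberships, predicted)
```

Unlike plain K-NN, the output is a membership value per class rather than a single vote count, so a student whose scores sit between two groups shows up as a near-even split instead of a hard label.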
