Abstract
Student dropout is one of the main challenges in education. It is influenced by factors such as absenteeism, economic pressure on families, low academic performance, and lack of motivation. Dropout affects not only the personal development of students but also the reputation of educational institutions. A technology-based approach such as data mining is therefore needed to identify students at risk of dropping out at an early stage. This study designs a model for detecting potential school dropouts using Logistic Regression and Decision Tree methods, based on student data from SMA N 4 Tegal. The variables used in the analysis cover demographic, academic, and social information, including absenteeism, average semester grades, parental income, and transportation type. Categorical data are converted into numeric values using one-hot encoding and label encoding. The results indicate that each method has its own advantages. The Decision Tree model achieves high precision, especially in predicting students who continue their education, with a precision of 0.99 for the "Continue School" class. However, recall for the "Dropout" class remains low (0.60), indicating that the detection of students at risk of dropping out still needs improvement. The Logistic Regression model, by contrast, shows a better balance between the two classes in terms of accuracy and recall. The study concludes that both models can be used to monitor dropout risk and to provide data-driven recommendations for more accurate educational decision-making.
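The gap between a 0.99 precision for "Continue School" and a 0.60 recall for "Dropout" is easier to see with a small worked example. The sketch below is illustrative only: the class counts (400 continuing students, 10 dropouts) and the predictions are hypothetical numbers chosen to reproduce metrics of the same magnitude, not the study's data, and `precision_recall` is a helper written for this example rather than a function from the study.

```python
def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall) for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# HYPOTHETICAL labels, not the SMA N 4 Tegal dataset:
# 400 students who continue and 10 who drop out.
y_true = ["continue"] * 400 + ["dropout"] * 10

# HYPOTHETICAL predictions: 2 continuing students are wrongly flagged,
# and only 6 of the 10 actual dropouts are detected.
y_pred = (["continue"] * 398 + ["dropout"] * 2
          + ["dropout"] * 6 + ["continue"] * 4)

cont_precision, cont_recall = precision_recall(y_true, y_pred, "continue")
drop_precision, drop_recall = precision_recall(y_true, y_pred, "dropout")

print(f"Continue School: precision={cont_precision:.2f}, recall={cont_recall:.2f}")
print(f"Dropout:         precision={drop_precision:.2f}, recall={drop_recall:.2f}")
```

With these assumed counts, missing 4 of the 10 dropouts barely lowers the precision of the majority class, yet it leaves 40% of at-risk students undetected. This is why recall on the minority class is the more informative figure for an early-warning model.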