Abstract
Educational data mining is an emerging field within data mining. Accurately identifying student accomplishment in a current or upcoming course can help institutions build better technology-aided education. Educational data mining is becoming a more important field of study because of its potential to produce knowledge-base models that can assist teachers and lecturers. Like other classification tasks, educational data mining has a common and frequently encountered problem: class imbalance. An imbalanced class is a condition in which the classes are not represented in equal proportions. In this research, the class distribution is found to be severely imbalanced, and the dataset is multiclass, consisting of more than two class labels. To address this problem, this paper focuses on imbalanced class handling and classification, using several methods for each: Linear Regression, Random Forest, and Stacking for classification, and SMOTE, ADASYN, and SMOTE-ENN for resampling. The methods are evaluated using 10-fold cross-validation and an 80-20 splitting ratio. The results show that the best performance comes from Stacking classification on the ADASYN-resampled dataset, evaluated using the 80-20 splitting ratio, with an F1 score of 0.97. The results also show that resampling improves classification performance. Classification without resampling also produced decent results, which may have several causes, such as the general pattern of the data for each class already being good from the start. Thus, there is no real drawback to processing the original data.
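To make the resampling idea concrete, the following is a minimal sketch of SMOTE-style oversampling on a multiclass dataset, written in plain Python. It is not the implementation used in this study: the function name `smote_like_oversample`, the neighbour count `k`, the toy feature values, and the class labels "A", "B", and "C" are all assumptions chosen for illustration. ADASYN and SMOTE-ENN are not covered here, and in practice a maintained library such as imbalanced-learn would normally be used instead.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Oversample every minority class up to the majority class size.

    Each synthetic point is interpolated between a minority sample and one
    of its k nearest same-class neighbours, which is the core idea of SMOTE.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the largest class
    X_out, y_out = [list(row) for row in X], list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # a single sample has no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # Nearest same-class neighbours by squared Euclidean distance.
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced multiclass data (assumed values, not from the study).
X = [[float(i), float(i % 3)] for i in range(13)]
y = ["A"] * 8 + ["B"] * 3 + ["C"] * 2

X_res, y_res = smote_like_oversample(X, y)
print("before:", dict(Counter(y)))
print("after: ", dict(Counter(y_res)))
```

In this sketch every class is grown to the size of the largest class, so the three toy classes end up with eight samples each. When evaluating with a train/test split or cross-validation, resampling would typically be applied to the training portion only, so that synthetic points do not leak into the evaluation data.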