Abstract

Dropout rates in Massive Open Online Courses (MOOCs) remain very high. One way to reduce the number of dropouts is to intervene with at-risk students, which requires an accurate and consistent prediction system. Machine learning is the most popular approach to this task. Because machine learning aims at models that generalize beyond the training data, overfitting is a critical issue in supervised learning, and a single classifier may therefore fail to classify correctly. In addition, the inherent class imbalance between dropout (the majority class) and non-dropout (the minority class) makes it difficult to build robust predictions. To overcome these problems, ensemble learning (EL) is combined with the synthetic minority over-sampling technique (SMOTE). The resulting SMOTE-Ensemble Learning (SEL) method applies majority voting over three base machine learning methods: Logistic Regression (LR), K-Nearest Neighbor (KNN), and Random Forest (RF). Results on the KDDCUP2015 dataset show that this combination improves prediction performance, with an average improvement of 7.74% in F1-score (the harmonic mean of precision and recall) compared to previous work.
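The three ingredients named in the abstract (SMOTE over-sampling, majority voting, and the F1-score) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the function names `smote`, `majority_vote`, and `f1_score`, the neighbour count `k=3`, and the fixed random seed are assumptions made for illustration, and the base classifiers (LR, KNN, RF) are represented only by their predicted labels.

```python
import random
from collections import Counter


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def majority_vote(*predictions):
    """Hard voting: for each sample, return the label most classifiers chose."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a full pipeline one would more likely rely on library components, for example `SMOTE` from imbalanced-learn and `VotingClassifier(voting="hard")` from scikit-learn, and would typically apply the over-sampling to the training split only, so that synthetic samples do not leak into the evaluation data.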
