Abstract
Google Forms is increasingly used as a tool for gathering research data in the educational domain, and several researchers use real-time web applications to collect survey responses. Demographic and geographic features are the most important features in such studies. Identifying students’ demographic features (gender, age-group, course, institution, or university) and geographic features (locality and country) is a challenging problem in machine learning. We propose a novel predictive algorithm, Student Demographic Identification (SDI), to identify a student’s demographic features (age-group, course) with the highest accuracy. SDI was tested on reliable primary samples and compared with the traditional machine learning algorithms Random Forest (RF), Logistic Regression (LR), and Radial Support Vector Machine (R-SVM). The proposed algorithm significantly improved the performance metrics of these classifiers, including accuracy, F1-score, precision, recall, and Matthews Correlation Coefficient (MCC). We also propose significant features for identifying students’ age-group, course, and gender. SDI identified the student’s age-group with an accuracy of 96% and the course with an accuracy of 97%. Gradient Boosting (GB) improved the accuracy of LR, R-SVM, and RF in predicting the student’s gender, and the RF algorithm supported by GB attained the highest accuracy, 98%, in identifying the students’ gender. All three classifiers also identified the student’s locality and institution with an identical accuracy of 99%. The proposed SDI algorithm may be useful for real-time survey applications that predict students’ demographic features.
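As a reading aid for the metrics named above, the following is a minimal sketch of how accuracy, precision, recall, F1-score, and MCC can be computed from predicted and true labels in the binary case. The function name `binary_metrics` and the example labels are hypothetical and purely illustrative; they are not the SDI algorithm or the paper's data.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, F1 and MCC for 0/1 labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN));
    # by convention it is set to 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Made-up labels (assumption): 8 samples, one false negative, one false positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the classes are imbalanced. Multi-class targets such as age-group or course would need a generalized form of these metrics.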