Abstract

Feature selection is an integral part of feature engineering prior to deep learning (DL) model development. The idea is to reduce the complexity of high-dimensional data structures by retaining only relevant information in the data mining process. A critical difficulty in developing a DL model to predict student performance is the high dimensionality of student profiles, which results in a DL model with low performance metrics. A student's profile spans different aspects such as demographic information, academic records, technological resources, social attitudes, family background, and/or socio-economic status. Empirically, the diversity of these data produces complexity in terms of dimension. In this paper, we compare the effectiveness of four feature selection algorithms (information gain based, ReliefF, Boruta, and recursive feature elimination) on deep learning models using an educational dataset from Portugal. Effectiveness is measured with the following model performance metrics: training accuracy, validation accuracy, testing accuracy, the kappa statistic, and the F-measure. Results reveal the robustness of the Boruta algorithm for dimensionality reduction: it allowed the deep learning model to achieve its highest performance metrics compared with those obtained using the other feature selection algorithms.
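To make the comparison concrete, the following is a minimal sketch of how two of the four feature selection methods named above (information gain via mutual information, and recursive feature elimination) can be applied before model training. It uses scikit-learn on synthetic data as a hypothetical stand-in for the Portuguese student dataset; the variable names and parameter values are illustrative assumptions, not the paper's actual pipeline.

```python
# Hedged sketch: ranking features with two of the selection methods
# compared in the paper. Synthetic data stands in for the real
# (unavailable here) student-profile dataset.
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a high-dimensional student profile dataset:
# 30 features, only 5 of which are actually informative.
X, y = make_classification(n_samples=300, n_features=30,
                           n_informative=5, random_state=0)

# Information-gain-style ranking via mutual information with the target.
mi_scores = mutual_info_classif(X, y, random_state=0)
mi_top = sorted(range(X.shape[1]), key=lambda i: -mi_scores[i])[:10]

# Recursive feature elimination, using a linear model as the estimator.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
rfe.fit(X, y)
rfe_top = [i for i in range(X.shape[1]) if rfe.support_[i]]

print("Mutual-information top-10 feature indices:", sorted(mi_top))
print("RFE-selected feature indices:", rfe_top)
```

The reduced feature subset (here, 10 of 30 columns) would then be fed to the DL model; the paper's finding is that the subset chosen by Boruta (not shown here, but available via third-party packages such as BorutaPy) yielded the best downstream metrics.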
