The COVID-19 pandemic has significantly changed learning processes. Learning, which had generally been carried out face-to-face, has now moved online. This learning strategy has both advantages and challenges. On the bright side, online learning is unbound by space and time, allowing it to take place anywhere and anytime. On the other hand, it faces a common challenge in the lack of direct interaction between educators and students, making it difficult to assess students’ engagement during an online learning process. Therefore, it is necessary to conduct research aimed at automatically detecting students’ engagement during online learning. The data used in this research were derived from the DAiSEE dataset (Dataset for Affective States in E-Environments), which comprises ten-second video recordings of students. This dataset classifies engagement levels into four categories: very low, low, high, and very high. However, the issue of imbalanced data found in the DAiSEE dataset has yet to be addressed in previous research. This data imbalance can bias the classification model, resulting in overfitting or underfitting. In this study, a Convolutional Neural Network, a deep learning model, was utilized for feature extraction on the DAiSEE dataset. The OpenFace library was used to perform facial landmark detection, head pose estimation, facial action unit recognition, and eye gaze estimation. The pre-processing stages included data selection, dimensionality reduction, and normalization. The PCA and SVD techniques were used for dimensionality reduction. The data were then oversampled using the SMOTE algorithm. The training and testing data were split at an 80:20 ratio. The results obtained from this experiment exceeded the benchmark evaluation values on the DAiSEE dataset, achieving the best accuracy of 77.97% using the SVD dimensionality reduction technique.
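The pre-processing pipeline described above (dimensionality reduction, SMOTE-style oversampling, 80:20 split) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the paper's actual implementation: the feature matrix stands in for the OpenFace feature vectors, the class counts are invented to mimic imbalance, and a simplified SMOTE-style interpolation is written out by hand instead of a library call. Truncated SVD is used for the reduction step (PCA could be swapped in the same way).

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic stand-in for per-clip OpenFace feature vectors; the class
# counts below are invented to mimic the imbalance in DAiSEE labels.
X = rng.normal(size=(300, 50))
y = np.array([0] * 20 + [1] * 40 + [2] * 120 + [3] * 120)

def smote_like(X_min, n_new, k=5):
    """Minimal SMOTE-style oversampling: each synthetic sample is an
    interpolation between a minority point and one of its k nearest
    minority-class neighbours."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = idx[i][rng.integers(1, k + 1)]  # position 0 is the point itself
        lam = rng.random()
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.asarray(new)

# Oversample every minority class up to the majority-class count.
counts = np.bincount(y)
target = counts.max()
X_parts, y_parts = [X], [y]
for c, n in enumerate(counts):
    if n < target:
        X_parts.append(smote_like(X[y == c], target - n))
        y_parts.append(np.full(target - n, c))
X_bal = np.vstack(X_parts)
y_bal = np.concatenate(y_parts)

# Dimensionality reduction with truncated SVD, then an 80:20 split.
X_red = TruncatedSVD(n_components=20, random_state=0).fit_transform(X_bal)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_red, y_bal, test_size=0.2, stratify=y_bal, random_state=0)

print(X_tr.shape, X_te.shape)  # 80% / 20% of the balanced, reduced data
```

After oversampling, every class has the same number of samples, so the stratified 80:20 split preserves the balance in both the training and testing sets.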