Enhancing students’ attention during lectures is a crucial concern for educators. Traditionally, teachers assess student focus subjectively, which can introduce bias. We propose using neural networks to detect and assess students’ attention by analyzing their physiological factors and facial expressions. This approach can identify students who are losing focus and help them refocus without teacher intervention. Our method involves two steps. First, we use the EduViT model with Self-Supervised Learning (SSL) to determine each student’s focus level. Second, we implement a system that monitors student attention through a camera mounted in the classroom, while a device attached to each student’s desk displays their concentration level in three tiers via a Light-Emitting Diode (LED) and alerts them when their attention drifts. The EduViT model is based on the MobileViT architecture, extended with Squeeze-and-Excitation (SE) blocks to strengthen the connections between channels compared to the original. To optimize the model’s weight calculation and limit redundancy in the learned features, we apply the Self-Supervised Learning Barlow Twins (SSL-BT) method after feature extraction on the CelebA dataset. Our model achieves a maximum accuracy of 66.51% and an F1-score of 54.60% in recognizing facial expressions on the FER2013 dataset. Finally, we export the trained model in the “tflite” format and integrate it into a practical student attention evaluation system, in which the three-state LED on each desk notifies students of their concentration level, helping them stay focused and improve their study discipline.
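To make the redundancy-reduction idea behind the Barlow Twins objective concrete, the following is a minimal sketch in plain Python, not the training code used for EduViT. It assumes two batches of embeddings, `z_a` and `z_b`, produced from two augmented views of the same images; the function names and the off-diagonal weight `lambda_offdiag` (set here to an illustrative 5e-3) are assumptions for this example and do not come from the work described above. The loss pushes the cross-correlation matrix of the two views toward the identity: diagonal terms near 1 make the embedding invariant to augmentation, and off-diagonal terms near 0 decorrelate the feature dimensions.

```python
import math


def standardize(batch):
    """Return feature columns normalized to zero mean and unit variance over the batch."""
    n = len(batch)
    dims = len(batch[0])
    columns = []
    for d in range(dims):
        col = [row[d] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        # A small epsilon guards against division by zero for a constant feature.
        columns.append([(x - mean) / (std + 1e-12) for x in col])
    return columns  # D columns, each of length N


def barlow_twins_loss(z_a, z_b, lambda_offdiag=5e-3):
    """Barlow Twins style loss for two (N x D) embedding batches given as nested lists.

    lambda_offdiag is an assumed, illustrative weight for the off-diagonal term.
    """
    cols_a = standardize(z_a)
    cols_b = standardize(z_b)
    n = len(z_a)
    on_diag = 0.0   # invariance term: sum_i (1 - C_ii)^2
    off_diag = 0.0  # redundancy term: sum_{i != j} C_ij^2
    for i, col_a in enumerate(cols_a):
        for j, col_b in enumerate(cols_b):
            # Entry (i, j) of the cross-correlation matrix between the two views.
            c_ij = sum(a * b for a, b in zip(col_a, col_b)) / n
            if i == j:
                on_diag += (1.0 - c_ij) ** 2
            else:
                off_diag += c_ij ** 2
    return on_diag + lambda_offdiag * off_diag


# Toy embeddings (hypothetical values): the first batch has two uncorrelated
# features, the second repeats the same feature twice and is therefore redundant.
decorrelated = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
redundant = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]

loss_decorrelated = barlow_twins_loss(decorrelated, decorrelated)
loss_redundant = barlow_twins_loss(redundant, redundant)
```

In this toy case the redundant embedding receives the higher loss even though both views agree perfectly, which is the behavior that discourages duplicated feature dimensions. A practical implementation would compute the same quantity with a tensor library on the model's projected features.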