The growing use of computer vision to enhance learning environments has attracted attention in recent educational research literature. Affective tutoring systems (ATS), student emotion recognition systems (SERS), sentiment analysis, and multi-agent systems (MAS) have significantly enhanced the online learning environment. Without these technologies, the growth of online learning was found to be of limited value, because tutors have only a narrow view of their learning environment and lacked a mechanism for observing and correcting what happens in it. Computer vision was therefore found to enhance the online learning environment by giving tutors better monitoring ability and assistive measures. This study conducted a systematic review of 33 papers that reflect the use of these technologies in education. It examined methods for extracting and analyzing emotional data, machine learning algorithms, adaptability to individual learners, camera quality, and ethical issues, with the aim of clarifying the strengths and weaknesses of these systems. The review indicates that emotion recognition systems can improve online learning. These technologies quantify and analyze students' emotional responses, helping educators improve their teaching techniques and materials. With real-time input on students' emotions, teachers can adjust their methods to keep students engaged and improve academic performance.