As a crucial form of non-verbal communication, facial expressions convey an individual's emotional state through combinations of facial muscle movements and thus offer an effective window into emotional shifts among group members. In real-time collaborative environments, however, the growing influx of audiovisual information can overwhelm an individual's limited visual attention, making it difficult to continuously observe and analyze other members' facial expressions. Failing to recognize and interpret facial expressions promptly can disrupt both individual- and group-level comprehension and evaluation of emotional dynamics during collaborative interactions, in turn hindering subsequent emotion management and communication. This article examines the classification of facial expression information and its formation mechanism in order to analyze the relationship between group members' emotional perceptions and expressions during collaboration. Building on the YOLO-FaceV2 face detection model, the FaceNet face recognition model, and the CERN expression recognition model, we present a novel Collaborative Emotion Analysis Framework (CEAF) for multi-person facial expression recognition. Using Web Real-Time Communication (WebRTC) technology, the framework is integrated into an online group meeting system that identifies participants' faces and expressions and provides real-time visual analysis of their expression information. After a meeting concludes, the system performs an emotional evaluation using pre-defined operators, providing valuable insight into the emotional dynamics of the group and of individual members throughout the collaborative process. Finally, validation on an online meeting instance indicates that the system can substantially help groups interpret emotional changes among collaborating members.
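To make the three-stage pipeline concrete, the following is a minimal sketch of how detection, recognition, and expression classification might compose on a single video frame. The wrapper objects `detector`, `embedder`, and `classifier` stand in for YOLO-FaceV2, FaceNet, and CERN respectively; their interfaces (`detect`, `embed`, `predict`), the enrolled-participant `gallery`, and the distance `threshold` are illustrative assumptions, not the paper's actual API.

```python
# Sketch of the per-frame pipeline: face detection -> identity matching -> expression label.
# detector / embedder / classifier are HYPOTHETICAL wrappers for YOLO-FaceV2,
# FaceNet, and CERN; they do not correspond to a real published package API.

from dataclasses import dataclass

import numpy as np


@dataclass
class FaceResult:
    identity: str                      # participant matched via FaceNet embedding distance
    expression: str                    # expression label predicted by the CERN-style model
    box: tuple[int, int, int, int]     # (x1, y1, x2, y2) face box from the detector


def analyze_frame(frame: np.ndarray, detector, embedder, classifier,
                  gallery: dict[str, np.ndarray],
                  threshold: float = 0.8) -> list[FaceResult]:
    """Run detection -> recognition -> expression classification on one frame."""
    results: list[FaceResult] = []
    for (x1, y1, x2, y2) in detector.detect(frame):       # stage 1: face detection
        crop = frame[y1:y2, x1:x2]
        emb = embedder.embed(crop)                        # stage 2: identity embedding
        # Match against enrolled participants by Euclidean distance in embedding space.
        name, dist = min(((n, float(np.linalg.norm(emb - g))) for n, g in gallery.items()),
                         key=lambda t: t[1], default=("unknown", float("inf")))
        identity = name if dist < threshold else "unknown"
        expression = classifier.predict(crop)             # stage 3: expression recognition
        results.append(FaceResult(identity, expression, (x1, y1, x2, y2)))
    return results


# In the described system, frames would arrive over a WebRTC media stream;
# per-frame results would then feed the real-time visualization and the
# post-meeting evaluation operators. Illustrative wiring:
#   for r in analyze_frame(frame, detector, embedder, classifier, gallery):
#       print(r.identity, r.expression, r.box)
```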