Abstract

The rapid expansion of video conferencing and remote work during the COVID-19 pandemic has produced a massive volume of video data that must be analyzed to understand audience engagement. Analyzing this data efficiently, particularly in real time, poses a scalability challenge, because online events can involve hundreds of people and last for hours. Existing solutions, especially open-source ones, usually require dedicated and expensive hardware and are designed as centralized cloud systems. They may also require users to stream their video to remote servers, which raises privacy concerns. This paper introduces scalable and efficient computer vision algorithms for analyzing face orientation and eye blinks in real time on edge devices, including Android, iOS, and Raspberry Pi. An example solution is presented for proctoring online meetings, workplaces, and exams. It analyzes each participant's video on that participant's own device, which addresses both the scalability and the privacy issues, and it runs at up to 30 fps on a Raspberry Pi. The proposed face orientation detection algorithm is simple and efficient, and it estimates head pose in two degrees of freedom, horizontal and vertical. The proposed Eye Aspect Ratio (EAR) method with a simple adaptive threshold yields significantly fewer false positives and higher overall accuracy than the existing constant-threshold method. The algorithms are implemented and open sourced as a toolkit of modular, cross-platform MediaPipe Calculators and Graphs, so that users can easily create custom solutions for a variety of purposes and devices.
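To make the blink-detection idea concrete, the following is a minimal Python sketch, not the paper's implementation. The EAR function uses the commonly cited six-landmark definition (two vertical eyelid distances divided by twice the horizontal eye width). The abstract does not specify the adaptive rule, so the threshold here, a fixed fraction of a rolling mean of recent open-eye EAR values, is an assumption, as are the names `AdaptiveBlinkDetector`, `window`, `ratio`, and `warmup`.

```python
import math
from collections import deque


def eye_aspect_ratio(p):
    """EAR from six (x, y) eye landmarks ordered p1..p6:
    p1/p4 are the eye corners, p2/p3 the upper lid, p6/p5 the lower lid."""
    vertical_a = math.dist(p[1], p[5])  # |p2 - p6|
    vertical_b = math.dist(p[2], p[4])  # |p3 - p5|
    horizontal = math.dist(p[0], p[3])  # |p1 - p4|
    return (vertical_a + vertical_b) / (2.0 * horizontal)


class AdaptiveBlinkDetector:
    """Hypothetical adaptive threshold: the eye counts as closed when EAR
    falls below `ratio` times the rolling mean of recent open-eye EARs."""

    def __init__(self, window=30, ratio=0.75, warmup=5):
        self.history = deque(maxlen=window)  # recent open-eye EAR values
        self.ratio = ratio
        self.warmup = warmup
        self.closed = False
        self.blinks = 0

    def update(self, ear):
        """Feed one frame's EAR; return True when a blink has just ended."""
        if len(self.history) < self.warmup:
            # Assumes the eye is open while the baseline is being collected.
            self.history.append(ear)
            return False
        baseline = sum(self.history) / len(self.history)
        is_closed = ear < self.ratio * baseline
        blink_ended = self.closed and not is_closed
        if blink_ended:
            self.blinks += 1
        if not is_closed:
            # Only open-eye frames update the per-user baseline.
            self.history.append(ear)
        self.closed = is_closed
        return blink_ended


open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
detector = AdaptiveBlinkDetector()
for ear in [0.3] * 10 + [0.1, 0.1] + [0.3]:
    detector.update(ear)
```

Because the baseline follows each user's own open-eye EAR, the same code adapts to people whose eyes are naturally narrower or wider, which is the kind of case where a single constant threshold tends to produce false positives.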
