Objective: To develop a communication system for people with speech difficulties that allows them to express their needs by issuing instructions to a computer with minimal eye blinks, using a model built with MediaPipe and deep learning techniques. Theoretical Framework: The research draws on concepts of eye position tracking, convolutional neural networks (CNNs), and MediaPipe technology, with a focus on applying them to the communication needs of people with speech difficulties. Method: A qualitative, exploratory study using convolutional networks and MediaPipe techniques. The dataset was created by combining web scraping with manual image collection, and the model was trained by comparing the performance of two CNN architectures. Results and Discussion: The use of AI for eye-blink detection is relatively recent, with publications increasing since 2020. The system was found to process facial gestures in real time with an average delay of 0.5 seconds. Users reported improvements in their ability to communicate independently and a reduction in the effort their relatives had to make to interpret their needs. An accuracy of 94.5% was achieved under standard lighting conditions and 92% under variable conditions. Research Implications: The research shows how AI, by incorporating continuously emerging methods, can improve image detection for eye tracking, with steadily improving precision. Originality/Value: The study applies emerging AI techniques in eye tracking to the development of a system that helps people with speech problems communicate.
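To illustrate how a sequence of blinks could be turned into an instruction, the following is a minimal sketch and not the authors' implementation. It assumes that an upstream classifier (for example, a CNN applied to eye regions) already yields one open/closed label per video frame. The `COMMANDS` table, the `min_frames` threshold, and the function names are hypothetical and chosen only for illustration; the abstract does not specify the blink-to-instruction mapping.

```python
from typing import Dict, List

# Hypothetical command table: the mapping from blink count to
# instruction is an assumption, not taken from the study.
COMMANDS: Dict[int, str] = {1: "yes", 2: "no", 3: "need help"}


def count_blinks(closed: List[bool], min_frames: int = 2) -> int:
    """Count runs of consecutive closed-eye frames lasting at least min_frames.

    `closed[i]` is True when the eye is classified as closed in frame i.
    Runs shorter than `min_frames` are treated as classifier noise.
    """
    blinks = 0
    run = 0
    for is_closed in closed:
        if is_closed:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    # A run that is still open at the end of the window also counts.
    if run >= min_frames:
        blinks += 1
    return blinks


def decode_command(closed: List[bool], min_frames: int = 2) -> str:
    """Map the number of blinks in a window of frames to an instruction."""
    return COMMANDS.get(count_blinks(closed, min_frames), "unknown")
```

For example, calling `decode_command` on a window containing two separate runs of closed-eye frames would select the entry for two blinks in the hypothetical table. Requiring a minimum run length is one simple way to keep a single misclassified frame from registering as a blink.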