Abstract

According to the WHO, more than one billion people worldwide live with a disability, and about 80 percent of them live in developing countries. Demand for applications that assist people with disabilities is therefore growing steadily. This paper applies neural network methods, namely MediaPipe Holistic and an LSTM module, to recognize the sign language of people with disabilities. MediaPipe has demonstrated low latency and high tracking accuracy in real-world scenarios thanks to its built-in tracking solutions; accordingly, this work uses MediaPipe Holistic, which combines pose, hand, and face tracking at a detailed landmark level. The main purpose of this paper is to show the effectiveness of a human action recognition (HAR) algorithm, based on a deep learning architecture, for classifying actions into seven classes. The central problem addressed is achieving a high recognition rate for the sign language of people with disabilities so that the method can be deployed in cross-platform applications, web applications, and social networks that ease their daily life and their interaction with society. To solve this problem, an algorithm combining a convolutional neural network (CNN) and long short-term memory (LSTM) is used to learn spatial and temporal features from three-dimensional skeletal data captured with a Microsoft Kinect camera. This combination exploits the strengths of the LSTM in modeling temporal data and of the CNN in modeling spatial data. Results obtained by adding a new layer to the existing model showed higher accuracy than those of the existing model alone.
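To make the described architecture concrete, the following is a minimal sketch of a CNN+LSTM classifier over per-frame MediaPipe Holistic keypoints, not the authors' exact model. The sequence length, layer widths, and kernel sizes are illustrative assumptions; only the seven output classes and the landmark count (33 pose + 468 face + 2 × 21 hand landmarks = 543) come from the paper and MediaPipe Holistic itself.

```python
# Hedged sketch: CNN layers model spatial structure of the keypoint
# vectors, LSTM layers model the temporal evolution of a gesture.
# All hyperparameters below are assumptions, not the paper's values.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 7             # seven action classes, as in the paper
SEQ_LEN = 30                # assumed number of frames per gesture clip
NUM_KEYPOINTS = 543         # MediaPipe Holistic: 33 pose + 468 face + 2*21 hands
FEATS = NUM_KEYPOINTS * 3   # x, y, z per landmark

model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, FEATS)),
    # 1D convolutions extract spatial features from each frame's keypoints
    layers.Conv1D(64, kernel_size=3, activation="relu", padding="same"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(128, kernel_size=3, activation="relu", padding="same"),
    # stacked LSTMs capture temporal dependencies across frames
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy batch to confirm the shapes line up
x = np.random.rand(8, SEQ_LEN, FEATS).astype("float32")
print(model.predict(x).shape)  # (8, 7)
```

The ordering here (convolutions before recurrence) reflects the division of labor the abstract describes: spatial features are extracted per frame first, and the LSTM then models how those features change over time.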
