Abstract
This paper considers the problem of face recognition in a video stream with augmented reality. The current state of the problem is investigated, the general face recognition process and the basic concepts of augmented reality are studied, and modern approaches to face recognition are analyzed to identify the strengths and weaknesses of the methods used. The goal is a method invariant to scaling, scene changes, head turns, lighting changes, accessories, and changes in emotion. An algorithm, an architecture, and a software system that solve the problem of face recognition in a video stream with elements of augmented reality have been developed. A histogram of oriented gradients (HOG) is chosen as the face detection method; the recognition functionality is built on the ResNet34 convolutional neural network architecture. Experimental studies are carried out, and the system is tested on both one face and several faces simultaneously. Recognition quality is evaluated by plotting ROC curves, which show the dependence of the true positive rate on the false positive rate, and by measuring AUC. AUC = 0.95 is achieved for recognition of a single face and AUC = 0.83 for recognition of several faces (at most four).

Statement of the problem. Investigation and analysis of existing approaches to building face recognition technology in augmented reality systems: analyzing models, methods, and algorithms for human face recognition, identifying the strengths and weaknesses of existing solutions, and choosing the best combination of detection and recognition methods.

Analysis of recent research and publications. Approaches have been proposed for forming biometric face image templates that can be used for biometric verification or face identification.
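The evaluation described above plots ROC curves and reports AUC values. As a minimal illustration (not the authors' code), AUC can be computed directly from verification scores with the rank-based (Mann-Whitney) formulation: the fraction of genuine/impostor score pairs in which the genuine pair scores higher.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) statistic.

    labels: 1 for a genuine (same-person) comparison, 0 for an impostor one.
    scores: similarity scores, higher = more likely genuine.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    # Count concordant (positive > negative) pairs; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For a perfect separation of genuine and impostor scores this yields 1.0; an AUC of 0.95, as reported for single-face recognition, means 95% of genuine/impostor pairs are ranked correctly.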
However, all recent face recognition results have been obtained with deep convolutional neural networks. In the work of Yu. V. Visilter et al., a convolutional neural network with a hash forest (ZNMHL), based on a convolutional network with a hash layer, has been proposed. J. Betty et al. have investigated how various factors influence recognition quality.

The purpose of the article is to prove the effectiveness of the proposed approach, based on the histogram of oriented gradients and the ResNet34 convolutional neural network architecture, for face recognition in a video stream in augmented reality systems.

Presenting the main material. The basic concepts of augmented reality are analyzed and the face recognition process is described. The software solution is described and its mathematical model is developed. The algorithm of the face recognition program developed by the authors is detailed, and the architecture of the face recognition system for an augmented reality video stream is designed.

Results. The software system designed to recognize human faces in augmented reality video streams has shown satisfactory results. The application correctly recognizes a face present in the database under different lighting conditions and head rotations, with the presence of accessories, partial occlusion of the face, changes in emotion, etc., and performs similarly when recognizing multiple faces at the same time. The system has been tested on 520 examples: 4 people, separately and together in different combinations, under different conditions of lighting, noise, interference, accessories, and emotions.

Conclusion. Applying a neural network with the ResNet architecture, with appropriate settings, to detecting and recognizing human faces in augmented reality video streams is a good choice: the method is invariant to scaling, scene changes, head turns, lighting changes, accessories, and changes in emotion.
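Recognition against a database of known faces, as described in the results above, typically reduces to nearest-neighbor matching of embedding vectors (the paper's ResNet34 network would produce such descriptors). The sketch below is illustrative, not the authors' implementation: the gallery contents, the vector dimensionality, and the 0.6 distance threshold (a cutoff conventional for dlib-style 128-D face embeddings) are assumptions.

```python
import numpy as np

# Assumed Euclidean cutoff; in practice it is tuned on verification data.
THRESHOLD = 0.6

def identify(gallery, probe, threshold=THRESHOLD):
    """Return the name of the gallery entry whose embedding is closest
    to the probe embedding, or None if no entry is within the threshold.

    gallery: dict mapping person name -> embedding (np.ndarray).
    probe:   embedding of the detected face (np.ndarray).
    """
    name, best = None, threshold
    for person, emb in gallery.items():
        d = np.linalg.norm(emb - probe)
        if d < best:  # keep the closest match under the cutoff
            name, best = person, d
    return name

# Tiny synthetic example with 3-D vectors for readability.
gallery = {
    "person_a": np.array([1.0, 0.0, 0.0]),
    "person_b": np.array([0.0, 1.0, 0.0]),
}
```

An unknown face (far from every gallery embedding) maps to `None`, which is how such a system avoids mislabeling people absent from the database.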
The system is a platform for further development. In particular, it is planned to conduct experimental studies using other methods of face recognition in a video stream and to perform a comparative analysis of the results, as well as to create a more convenient graphical interface for the program and to adapt it for mobile devices.
Published in: Вісник Черкаського державного технологічного університету (Bulletin of Cherkasy State Technological University)