Abstract

In face recognition systems, the robustness of the facial feature representation and the performance of the classification algorithm both affect recognition accuracy under unrestricted conditions. To explore the anti-interference performance of a convolutional neural network (CNN) reconstructed within a deep learning (DL) framework for face image feature extraction (FE) and recognition, this paper first combines the inception structure of the GoogleNet network with the residual structure of the ResNet network to construct a new deep reconstruction network algorithm, with stochastic gradient descent (SGD) as the model optimizer and the triplet loss function as the classifier, and applies it to face recognition on the Labeled Faces in the Wild (LFW) face database. Then, portrait pyramid segmentation and local feature point segmentation are applied to extract features from face images, and face feature points are matched using the Euclidean distance and the joint Bayesian method. Finally, MATLAB is used to simulate the proposed algorithm and compare it with other algorithms. The results show that the proposed algorithm achieves its best face recognition performance when the learning rate is 0.0004, the attenuation coefficient is 0.0001, the training method is SGD, and dropout is 0.1 (accuracy: 99.03%, loss: 0.0047, training time: 352 s, and overfitting rate: 1.006), and that it has the largest mean average precision among the compared CNN algorithms. The correct rate of face feature matching of the proposed algorithm is 84.72%, which is higher than that of the LeNet-5, VGG-16, and VGG-19 algorithms, the correct rates of which are 6.94%, 2.5%, and 1.11%, respectively, but lower than that of the GoogleNet, AlexNet, and ResNet algorithms. At the same time, the proposed algorithm has a shorter matching time (206.44 s) and a higher correct matching rate (88.75%) than the joint Bayesian method, indicating that the proposed deep reconstruction network algorithm can be used for face image recognition, FE, and matching, and that it has strong anti-interference capability.
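The abstract names the triplet loss and Euclidean-distance matching without giving formulas. The following is a minimal NumPy sketch of how those two pieces are commonly written, not the authors' MATLAB implementation. The margin of 0.2, the distance threshold of 1.0, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def euclidean_distance(a, b):
    """Euclidean (L2) distance between embeddings along the last axis."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.linalg.norm(a - b, axis=-1)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared distances, averaged over the batch.

    margin=0.2 is an assumed value, not one reported in the paper.
    """
    d_pos = euclidean_distance(anchor, positive) ** 2  # same identity
    d_neg = euclidean_distance(anchor, negative) ** 2  # different identity
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

def is_same_person(emb_a, emb_b, threshold=1.0):
    """Declare a match when two embeddings are closer than an assumed threshold."""
    return bool(euclidean_distance(emb_a, emb_b) < threshold)

# Toy 2-D embeddings: the positive is near the anchor, the negative is far away.
anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 0.1]])
negative = np.array([[1.0, 0.0]])
print(triplet_loss(anchor, positive, negative))
print(is_same_person(anchor[0], positive[0]))
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so training only pushes on triplets that still violate that ordering.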

Highlights

  • Face recognition (FR) technology has been extensively adopted in identity recognition; it relies mainly on detecting biological features of the face, which offer strong uniqueness and security [1]

  • Existing studies have shown that as the convolutional neural network (CNN) structure deepens, the training results improve, but the amount of network computation also increases [17]. Therefore, an inception network structure is proposed in the GoogleNet network, which makes full use of the features extracted by each layer of the network and can increase the depth and complexity of the network while keeping its computational cost under control

  • Some network parameter adjustments are proposed for the basic structure of the GoogleNet network: (1) the size of the feature maps should be reduced gradually through the network to avoid a bottleneck in the feature representation of the image; (2) high-dimensional features are easier to obtain than low-dimensional features and can accelerate the training speed of the model; (3) adjacent neurons in the network are closely correlated, so their spatial relations can be integrated in low-dimensional spaces
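To make the inception idea in the highlights concrete, together with the ResNet-style shortcut that the abstract combines it with, here is a minimal NumPy sketch of one block. It is an illustration under assumptions, not the paper's network: the two-branch layout, the channel counts, and all function and parameter names are hypothetical.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive stride-1, zero-padded ('same') 2D cross-correlation.

    x: (H, W, C_in); w: (k, k, C_in, C_out) with odd k.
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, wd = x.shape[:2]
    out = np.zeros((h, wd, w.shape[3]))
    for i in range(k):
        for j in range(k):
            # Add the contribution of kernel tap (i, j) for every pixel.
            out += xp[i:i + h, j:j + wd, :] @ w[i, j]
    return out

def relu(x):
    return np.maximum(x, 0.0)

def init_params(c_in, c_branch, c_reduce, seed=0):
    """Random weights for the hypothetical two-branch block."""
    rng = np.random.default_rng(seed)

    def he(k, ci, co):
        return rng.normal(0.0, np.sqrt(2.0 / (k * k * ci)), size=(k, k, ci, co))

    return {
        "b1_1x1": he(1, c_in, c_branch),      # branch 1: 1x1 conv
        "b3_reduce": he(1, c_in, c_reduce),   # branch 2: 1x1 channel reduction
        "b3_3x3": he(3, c_reduce, c_branch),  # branch 2: 3x3 conv on fewer channels
        "project": he(1, 2 * c_branch, c_in), # map concatenation back to c_in
    }

def inception_residual_block(x, p):
    """Parallel inception-style branches, concatenated, then added to the input."""
    b1 = relu(conv2d_same(x, p["b1_1x1"]))
    b3 = relu(conv2d_same(relu(conv2d_same(x, p["b3_reduce"])), p["b3_3x3"]))
    merged = np.concatenate([b1, b3], axis=-1)
    projected = conv2d_same(merged, p["project"])
    return relu(x + projected)  # residual shortcut

x = np.random.default_rng(1).normal(size=(8, 8, 16))
params = init_params(c_in=16, c_branch=8, c_reduce=4)
y = inception_residual_block(x, params)
print(y.shape)
```

The 1x1 reduction before the 3x3 convolution is what keeps the computation in check: the expensive kernel sees 4 channels instead of 16. The shortcut lets the block fall back to passing its input through when the branches contribute nothing.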

Summary

Introduction

Face recognition (FR) technology has been extensively adopted in identity recognition; it relies mainly on detecting biological features of the face, which offer strong uniqueness and security [1]. A dataset of face images has a highly nonlinear distribution, so applying a simple classification method produces high classification errors because of individual differences [2]. Methods commonly used for face detection include principal component analysis, support vector machines, CNNs, and the active deformation model [3]. Deep models have been applied in many fields, and image recognition was among the first applications to attract attention. Using a deep model for FE from images is far better than manual FE and can be applied effectively in fields where manual FE performs poorly [4]. DL mostly uses multilayer network structures, which fuse the low-level features of an image to form high-level features [5]. In 2012, AlexNet used a deep CNN (DCNN) to reduce the top-5 recognition error rate to 16.4%, and DCNN algorithms have been adopted in the subsequent champion recognition models [6, 7].
