With the increasing demand for security in many fast-growing applications, biometric recognition has become a prominent authentication approach, and user authentication through speech and face recognition is an important biometric technique for enhancing security. This paper proposes a multi-modal biometric recognition technique based on speech and facial features to improve the authentication of a system. Mel Frequency Cepstral Coefficients (MFCC) are extracted from audio as speech features. For visual recognition, this paper proposes a cascade hybrid facial (visual) feature extraction method based on static, dynamic and key-point salient features of the face, and shows that the proposed feature extraction method is more efficient than existing methods. In the proposed method, the Viola–Jones algorithm is used to detect static and dynamic features of the eyes, nose and lips, and the Scale Invariant Feature Transform (SIFT) algorithm is used to detect stable key-point features of the face. This paper also investigates an audio-visual integration method using AND logic. All experiments are carried out using an Artificial Neural Network (ANN) and a Support Vector Machine (SVM). An accuracy of 94.90% is achieved with the proposed feature extraction method. The main objective of this work is to improve the authenticity of any application using multi-modal biometric features. Adding facial features to speech recognition improves system security because biometric features are unique, and combining evidence from two modalities increases both the authenticity and the integrity of the system.
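The AND-logic integration described above can be sketched as a simple decision-level fusion rule: each modality's classifier (ANN or SVM) produces an accept/reject decision, and the user is authenticated only when both modalities accept. The function names, score scale and thresholds below are illustrative assumptions, not the paper's implementation.

```python
def modality_decision(score: float, threshold: float) -> bool:
    """Accept a single modality if its classifier score clears its threshold.
    (Hypothetical helper: scores assumed normalized to [0, 1].)"""
    return score >= threshold


def and_fusion(speech_score: float, face_score: float,
               speech_thresh: float = 0.5, face_thresh: float = 0.5) -> bool:
    """AND-logic decision fusion: grant access only when BOTH the speech
    and the face classifier accept the claimed identity."""
    return (modality_decision(speech_score, speech_thresh)
            and modality_decision(face_score, face_thresh))


# A genuine user must pass both modalities:
print(and_fusion(0.91, 0.87))  # both accept -> True
# An impostor who defeats only one modality is still rejected:
print(and_fusion(0.93, 0.12))  # face rejects -> False
```

The AND rule trades a slightly higher false-rejection rate for a much lower false-acceptance rate, which matches the paper's emphasis on authenticity and integrity over convenience.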