Abstract

This paper presents an improved likelihood ratio score fusion technique, based on a back-propagation learning neural network, for audio-visual speaker identification in noisy environments. Several signal preprocessing and noise removal techniques are used to process the speech utterance, and the LPC, LPCC, RCC, MFCC, ΔMFCC, and ΔΔMFCC methods are applied to extract features from the audio signal. An Active Shape Model is used to extract appearance-based and shape-based facial features. To improve performance, the appearance-based and shape-based facial features are concatenated, and Principal Component Analysis is used to reduce the dimension of the facial feature vector. The audio and visual feature vectors are then fed separately to Hidden Markov Models to obtain the log-likelihood of each modality. The reliability of each modality is calculated using a reliability measurement method. Finally, the integrated likelihood ratios are fed to a back-propagation learning neural network to produce the final speaker identification result. To measure performance, three databases are used: the NOIZEUS speech database, the ORL face database, and the VALID audio-visual multimodal database, for audio-only, visual-only, and audio-visual speaker identification, respectively. To compare the accuracy of the proposed system with existing techniques under noisy conditions, different types of artificial noise are added at various rates to the audio and visual signals, and performance is compared across different variations of audio and visual features.
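The fusion step described above can be illustrated with a small sketch. It is not the paper's implementation: the reliability measure used here (average gap between the best log-likelihood and the rest) is an assumption, since the abstract does not define the measure, and a reliability-weighted sum with an arg-max decision stands in for the back-propagation neural network. The function names `reliability`, `fuse_scores`, and `identify` are hypothetical, and the per-speaker log-likelihoods are assumed to come from already trained HMMs.

```python
def reliability(log_likelihoods):
    """Assumed reliability measure: mean gap between the best
    log-likelihood and the others. A large gap means the modality
    separates the candidate speakers clearly."""
    n = len(log_likelihoods)
    if n < 2:
        raise ValueError("need at least two candidate speakers")
    best = max(log_likelihoods)
    # The best candidate contributes a gap of 0, so divide by n - 1.
    return sum(best - ll for ll in log_likelihoods) / (n - 1)


def fuse_scores(audio_ll, visual_ll):
    """Combine per-speaker log-likelihoods from both modalities using
    reliability-derived weights (a stand-in for the neural network)."""
    if len(audio_ll) != len(visual_ll):
        raise ValueError("both modalities must score the same speakers")
    r_audio = reliability(audio_ll)
    r_visual = reliability(visual_ll)
    total = r_audio + r_visual
    # Fall back to equal weights when neither modality discriminates.
    w_audio = 0.5 if total == 0 else r_audio / total
    w_visual = 1.0 - w_audio
    fused = [w_audio * a + w_visual * v for a, v in zip(audio_ll, visual_ll)]
    return fused, w_audio


def identify(audio_ll, visual_ll):
    """Return the index of the speaker with the highest fused score."""
    fused, _ = fuse_scores(audio_ll, visual_ll)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative values only: the audio scores separate speaker 1 clearly,
# while the visual scores are nearly tied (e.g. a noisy face image).
audio_ll = [-120.0, -100.0, -125.0]
visual_ll = [-80.0, -81.0, -79.5]
speaker = identify(audio_ll, visual_ll)
```

With these made-up scores the audio modality receives most of the weight, so the nearly tied visual scores do not override the clear audio decision. In the paper's system, a trained network would learn this mapping rather than apply a fixed weighting rule.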

Highlights

  • Biometric authentication [1] has grown in popularity as a way to provide personal identification

  • The facial image preprocessing procedure is shown in Figure 4, where Figures 4(d) and 4(e) show the shape-based and appearance-based facial features, respectively

  • Experimental results are evaluated along several dimensions, such as the optimum number of hidden states of the Discrete Hidden Markov Model (DHMM), the response of the system to noisy facial images, and the system accuracy based on appearance-based, shape-based, and combined appearance and shape based facial features


Summary

Introduction

Biometric authentication [1] has grown in popularity as a way to provide personal identification. Personal identification is crucial in many applications, and the rise in credit card fraud and identity theft in recent years indicates that this is an issue of major concern in wider society. Individual passwords, PIN identification, and even token-based arrangements all have deficiencies that restrict their applicability in a widely networked society. Physiological characteristics relate to the shape of the body and vary from person to person. Behavioral characteristics relate to the behavior of a person; examples include signature, keystroke dynamics, and voice.

