Abstract
This paper measures the recognition capability of composite features extracted from the speech signal and compares the results with those of the individual features, for both spoken word and speaker recognition. Standard features, namely formants (F1, F2, F3), Linear Predictive Coefficients (LPC), and Mel Frequency Cepstral Coefficients (MFCC), are considered along with various combinations of them. Six different speakers and six different strings (words) are used in the study. The threshold is set through an iterative approach for both the spoken word and the speaker recognition experiments. The combination of LPC and MFCC is found to be the most promising of all those tested. Another interesting conclusion is that the composite feature approach gives accuracy very close to 100% in the speaker recognition task, compared with the spoken word recognition task.
General Terms
Digital Signal Processing, Pattern Recognition, Artificial Neural Network, Automatic Speech and Speaker Recognition
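The two ideas named in the abstract, a composite feature and an iteratively chosen threshold, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the features are combined or how the threshold is searched, so simple concatenation, Euclidean distance matching, and a linear sweep are stand-ins, and the function names `composite`, `distance`, and `pick_threshold` are hypothetical.

```python
import math


def composite(lpc, mfcc):
    """Join two per-utterance feature vectors into one composite vector.

    Assumption: "mixing" is modeled as plain concatenation.
    """
    return list(lpc) + list(mfcc)


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_threshold(genuine, impostor, steps=100):
    """Sweep candidate thresholds and keep the first one with the best accuracy.

    genuine:  distances between matching pairs (should fall at or below the threshold)
    impostor: distances between non-matching pairs (should fall above it)
    Assumption: a linear sweep stands in for the paper's iterative approach.
    """
    all_d = list(genuine) + list(impostor)
    lo, hi = min(all_d), max(all_d)
    best_t, best_acc = lo, -1.0
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        correct = sum(d <= t for d in genuine) + sum(d > t for d in impostor)
        acc = correct / len(all_d)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc
```

In use, each utterance would be reduced to one composite vector, distances to stored templates would be computed, and the threshold would be tuned separately for the spoken word and the speaker experiments.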