Abstract

In speaker identification systems, a database is constructed from the speech samples of known speakers. The approach implemented in this paper is hybrid: the wavelet transform and neural networks are used together to form a system with improved performance. Features are extracted by applying a discrete wavelet transform (DWT), while a neural network (NN) is used to build the system database and to handle decision making. The neural network is trained on the resulting feature vectors. A criterion that depends on both the false acceptance ratio (FAR) and the false rejection ratio (FRR) is used to evaluate system performance. To test the proposed system, a set of 25 male and female speakers of varying ages was used. Results of admitting the members of this set to a secure system were computed and are presented. The evaluation parameters obtained are FAR = 14.5% and FRR = 24.5%.
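The two evaluation ratios named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (FAR as the fraction of impostor attempts that are accepted, FRR as the fraction of genuine attempts that are rejected); the excerpt does not state the paper's exact formulas. The function name `far_frr` and the trial data are hypothetical.

```python
def far_frr(trials):
    """Compute (FAR, FRR) from (is_genuine, accepted) boolean pairs.

    Assumed conventional definitions, not taken from the paper:
      FAR = accepted impostor attempts / all impostor attempts
      FRR = rejected genuine attempts  / all genuine attempts
    """
    impostor = [accepted for genuine, accepted in trials if not genuine]
    genuine = [accepted for is_gen, accepted in trials if is_gen]
    if not impostor or not genuine:
        raise ValueError("need at least one genuine and one impostor trial")
    far = sum(impostor) / len(impostor)
    frr = sum(not accepted for accepted in genuine) / len(genuine)
    return far, frr


# Hypothetical trial log: 4 genuine attempts (1 rejected),
# 5 impostor attempts (1 accepted).
example_trials = (
    [(True, True)] * 3 + [(True, False)]
    + [(False, False)] * 4 + [(False, True)]
)
far, frr = far_frr(example_trials)
print(f"FAR = {far:.1%}, FRR = {frr:.1%}")  # FAR = 20.0%, FRR = 25.0%
```

The two ratios trade off against each other: tightening the acceptance threshold typically lowers FAR and raises FRR, which is why a criterion combining both is used.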

Highlights

  • Speaker identification is a broad and active area of research

  • Some recent work on speaker identification relies on classical features, including the cepstrum and its many variants [2], sub-band processing techniques [3,4,5,6], Gaussian mixture models (GMM) [7], linear prediction coding [8,9], the wavelet transform [10,11,12], and neural networks [11,12,13]

  • In [11], a hybrid approach of wavelet transform and neural networks is adopted to classify sounds heard over the chest wall, rather than uttered speech, so that they can be used for diagnosing pulmonary diseases


Summary

INTRODUCTION

Speaker identification is a broad and active area of research, and many approaches based on speech features have been proposed. We consider a hybrid approach in which feature extraction is performed using the discrete wavelet transform (DWT), while speaker modeling and speaker matching are both performed using neural networks. The pitch period (and hence the fundamental frequency) of speech varies from one individual to another; pitch frequency is generally higher for female voices and lower for male voices. This suggests that pitch might be a suitable parameter to distinguish one speaker from another, or at least to narrow down the set of probable matches [17]. Analysis of the frequency spectrum of the test utterance provides valuable information for speaker identification. The spectrum contains both pitch harmonics and vocal-tract resonant peaks, making it possible to identify the speaker with a high probability of being correct. The matching and decision-making processes rely on the ability of neural networks to accumulate knowledge about objects and processes using learning algorithms.
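To make the feature-extraction step concrete, the following is a minimal sketch of turning a speech frame into a DWT-based feature vector that a neural network could take as input. Everything specific here is an assumption for illustration: the excerpt does not say which wavelet, how many decomposition levels, or which coefficients-to-features mapping the paper uses. The sketch uses the Haar wavelet and normalized sub-band energies, and the names `haar_dwt_step` and `dwt_energy_features` are hypothetical.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT (assumed wavelet choice).

    Returns (approximation, detail) coefficient lists, each half as long
    as the input. The input length must be even.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def dwt_energy_features(signal, levels=3):
    """Normalized sub-band energies: one per detail band (finest first),
    then one for the final approximation band."""
    if levels < 1 or len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a nonzero multiple of 2**levels")
    energies = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    total = sum(energies)
    # Normalizing makes the features independent of overall loudness.
    return [e / total for e in energies] if total > 0 else energies


# Hypothetical frame: a slow oscillation plus a fast alternating component.
frame = [math.sin(2 * math.pi * n / 16) + 0.3 * (-1) ** n for n in range(32)]
features = dwt_energy_features(frame, levels=3)
print(len(features))  # 4 values: three detail bands and one approximation band
```

Each feature summarizes how much of the frame's energy falls in one frequency band, so the vector reflects the spectral shape that the text associates with pitch harmonics and vocal-tract resonances. In a full system, vectors like this would be computed per frame and passed to the neural network for training and matching.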

