Abstract
This paper reports a comparative study of a continuous hidden Markov model (CHMM) and an artificial neural network (ANN) for text-dependent, closed-set speaker identification (SID) on Thai speech recorded in office and telephone environments. The Thai isolated digits "0–9" and their concatenations are used as the spoken text, and mel-frequency cepstral coefficients (MFCC) are used as the features. The two well-known recognition engines, CHMM and ANN, are evaluated and compared. The ANN system, a multilayer perceptron network trained with the backpropagation learning algorithm, uses specially designed input feeding methods to avoid the distortion introduced by the normalization process. The CHMM system uses HMMs with general Gaussian density distributions. After some system parameters were optimized in preliminary experiments, CHMM gives the best identification rate in the office environment, 90.4%, which is slightly better than the 90.1% of ANN on digit "5". In the telephone environment, ANN gives the best identification rate, 88.84% on digit "0", which is higher than the 81.1% of CHMM on digit "3". With 3-concatenated digits, ANN and CHMM achieve identification rates of 97.3% and 95.7%, respectively, in the office environment, and 92.1% and 96.3%, respectively, in the telephone environment.
More From: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems