Abstract
It is well known that speaker identification yields very high performance in a neutral talking environment, on the other hand, the performance has been sharply declined in a shouted talking environment. This work aims at proposing, implementing, and evaluating novel Third-Order Circular Suprasegmental Hidden Markov Models (CSPHMM3s) to improve the low performance of text-independent speaker identification in a shouted talking environment. CSPHMM3s possess combined characteristics of: Circular Hidden Markov Models (CHMMs), Third-Order Hidden Markov Models (HMM3s), and Suprasegmental Hidden Markov Models (SPHMMs). Our results show that CSPHMM3s are superior to each of: First-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM1s), Second-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM2s), Third-Order Left-to-Right Suprasegmental Hidden Markov Models (LTRSPHMM3s), First-Order Circular Suprasegmental Hidden Markov Models (CSPHMM1s), and Second-Order Circular Suprasegmental Hidden Markov Models (CSPHMM2s) in a shouted talking environment. Using our collected speech database, average speaker identification performance in a shouted talking environment based on LTRSPHMM1s, LTRSPHMM2s, LTRSPHMM3s, CSPHMM1s, CSPHMM2s, and CSPHMM3s is 74.6%, 78.4%, 81.7%, 78.7%, 83.4%, and 85.8%, respectively. Speaker identification performance that has been achieved based on CSPHMM3s is close to that attained based on subjective assessment by human listeners.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.