Abstract

Recently, we explored the use of a Gaussian mixture model (GMM) based global transformer for artificial bandwidth extension (ABWE) to improve the automatic recognition of children's speech in mismatched conditions. Because the spectral characteristics of speech vary significantly from one sound class to another, a global transformation is sub-optimal for this purpose. Motivated by this, in this work we explore the use of class-specific GMM-based ABWE transformers for the bandwidth extension of narrowband speech. To derive the class-specific ABWE transformers, an existing unsupervised hidden Markov model (HMM) based method is used. For contrast, supervised class-specific GMM-based ABWE transformers are also explored. The unsupervised and supervised class-specific ABWE approaches result in 21.30% and 26.37% relative improvements in word error rate, respectively, on a digit recognition task. The effectiveness of class-specific ABWE is also evaluated in terms of the mutual information between the narrowband and the extended higher-band speech, as well as a group of other speech quality measures.
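
For readers unfamiliar with the metric, a relative improvement in word error rate is conventionally the reduction in WER expressed as a percentage of the baseline WER. The sketch below illustrates that convention only; the function name and the WER values are hypothetical and are not taken from the paper, which does not state its baseline figures here.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER improvement in percent.

    Both arguments are word error rates in percent. The values used in
    the example below are hypothetical, chosen only for illustration.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Hypothetical example: a baseline WER of 10.0% reduced to 7.5%.
improvement = relative_wer_improvement(10.0, 7.5)
print(f"{improvement:.2f}% relative improvement")
```

Under this convention, a relative improvement says nothing about the absolute WER on its own; the same percentage can correspond to very different absolute error rates depending on the baseline.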
