Abstract
Speech synthesized with hidden Markov models is intelligible but does not sound natural, because the speech spectra are over-smoothed. This study aims to improve naturalness while preserving acceptable intelligibility by decomposing the naturalness and intelligibility of synthesized speech with a novel asymmetric bilinear model based on non-negative matrix factorization. Subjective evaluations carried out on English data confirm that the proposed method outperforms the original asymmetric bilinear model, which is based on singular value decomposition, in factorizing naturalness and intelligibility. Moreover, the performance of the proposed method is comparable to that of other methods.
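As a rough illustration of the kind of factorization named in the abstract, the sketch below fits an asymmetric bilinear model of the form Y ≈ A B with non-negative factors. It rests on several assumptions that do not come from the paper: the data are a toy matrix rather than speech spectra, the two stacked "style" blocks and the "content" columns are hypothetical, and the fit uses the standard Lee-Seung multiplicative updates for the Frobenius objective, which may differ from the authors' actual procedure. All function and variable names are illustrative.

```python
import random

def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def frobenius_error(Y, W, H):
    """Frobenius norm of the residual Y - W H."""
    WH = matmul(W, H)
    return sum((y - a) ** 2
               for ry, ra in zip(Y, WH) for y, a in zip(ry, ra)) ** 0.5

def nmf(Y, rank, iters=200, eps=1e-9, seed=0):
    """Factor non-negative Y (n x m) into W (n x rank) and H (rank x m)
    using Lee-Seung multiplicative updates (an assumed, standard choice)."""
    rng = random.Random(seed)
    n, m = len(Y), len(Y[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T Y) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, Y), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (Y H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(Y, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H

# Toy data (hypothetical): two "styles", each with 3 feature dimensions,
# stacked row-wise, observed for 4 "content" classes (columns).
A_true = [[1.0, 0.2], [0.5, 0.9], [0.3, 0.4],   # style 0
          [0.8, 0.1], [0.2, 1.1], [0.6, 0.7]]   # style 1
B_true = [[1.0, 0.0, 0.5, 2.0],
          [0.0, 1.0, 1.5, 0.3]]
Y = matmul(A_true, B_true)

W, H = nmf(Y, rank=2)
# Asymmetric bilinear reading: one style-specific matrix per block of rows,
# and one shared content vector per column of H.
A_styles = [W[:3], W[3:]]
print("reconstruction error:", frobenius_error(Y, W, H))
```

In the singular value decomposition variant of the asymmetric bilinear model, the same stacked matrix would be factorized by a truncated SVD, whose factors may contain negative entries; the non-negativity constraint here keeps both the style-specific matrices and the content vectors additive.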