Abstract

Speech enhancement has become a promising technique to accommodate demands of the improvement in quality of a degraded speech signal. The main works now focus on separating normal speech from noise, but have neglected the low quality of impaired speech influenced by anomalous glottis flow. In order to effectively enhance the pathological speech, it is essential to design a separation mechanism for extracting high-dimensional timbre features and speech features separately to suppress low-dimensional noises. In this paper, we propose an enhancement model GBNF-VAE to extract timbre efficiently by reducing anomalous airflow noise interference, and by combining the semantic features with timbre features to synthesize the enhanced speech. In particular, the bottleneck feature can characterize the timbre by the controlled number of nodes through the Golden Sectionmethod, which effectively improves computational efficiency. In addition, variational autoencoder is adopted to extract semantic features which are combined with the previous timbre features to synthesize the enhanced speech. Finally, spectrum observation, objective indicators and subjective evaluation all show the outstanding performance of GBNF-VAE in pathological speech quality enhancement.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call