Abstract
An electrolarynx (EL) is a widely used device that mechanically generates excitation signals, making it possible for laryngectomees to produce EL speech without vocal fold vibrations. Although EL speech sounds relatively intelligible, it is significantly less natural than normal speech owing to its mechanical excitation signals. To address this issue, a statistical voice conversion (VC) technique based on Gaussian mixture models (GMMs) has been applied to EL speech enhancement. In this technique, input EL speech is converted into target normal speech by converting spectral features of the EL speech into spectral and excitation parameters of normal speech using GMMs. Although this technique significantly improves the naturalness of EL speech, the enhanced EL speech is still far from the target normal speech. To improve the performance of statistical EL speech enhancement, in this paper we propose an EL-to-speech conversion method based on CLDNNs, networks consisting of convolutional layers, long short-term memory (LSTM) recurrent layers, and fully connected deep neural network (DNN) layers. Three CLDNNs are trained: one to convert EL speech spectral features into spectral and band-aperiodicity parameters, one to convert them into unvoiced/voiced symbols, and one to convert them into continuous $F_{0}$ patterns. The experimental results demonstrate that the proposed method significantly outperforms the conventional method in terms of both objective evaluation metrics and subjective evaluation scores.
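The abstract names the architecture but gives no implementation detail. The following is a minimal sketch in PyTorch of a CLDNN of the kind described (convolutional front end, LSTM recurrent layers, fully connected output layers), instantiated three times for the three target streams; all layer sizes and feature dimensions (in_dim, hidden_dim, out_dim) are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class CLDNN(nn.Module):
    """Convolutional + LSTM + fully connected DNN stack (CLDNN)."""
    def __init__(self, in_dim=25, hidden_dim=256, out_dim=30):
        super().__init__()
        # Convolutional front end over the time axis of the feature sequence.
        self.conv = nn.Sequential(
            nn.Conv1d(in_dim, hidden_dim, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # Recurrent layers capture longer-span temporal context.
        self.lstm = nn.LSTM(hidden_dim, hidden_dim, num_layers=2,
                            batch_first=True)
        # Fully connected layers map to the target parameter stream.
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x):
        # x: (batch, frames, in_dim) EL speech spectral features
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.lstm(h)          # (batch, frames, hidden_dim)
        return self.fc(h)            # (batch, frames, out_dim)

# Three networks, one per target stream, as in the paper's setup
# (output dimensions are hypothetical):
spec_ap_net = CLDNN(out_dim=30)  # spectral + band-aperiodicity parameters
uv_net      = CLDNN(out_dim=1)   # unvoiced/voiced symbols
f0_net      = CLDNN(out_dim=1)   # continuous F0 pattern
```

At synthesis time, the three predicted streams would typically be recombined by a vocoder, with the U/V decision gating the continuous $F_{0}$ contour; the specifics of that pipeline are not given in the abstract.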