Abstract

This paper studies methods for speech bandwidth extension (BWE) using artificial neural networks. Several types of neural networks, including bidirectional networks such as restricted Boltzmann machines (RBMs) and bidirectional associative memories (BAMs), as well as feedforward deep neural networks (DNNs), are employed to restore high-frequency spectral envelopes from low-frequency ones. Compared with Gaussian mixture models (GMMs), which are widely adopted in conventional statistical approaches to BWE, neural networks are better at modeling the complex, non-linear mapping between high-dimensional feature vectors. Experimental results show that the neural-network-based BWE methods proposed in this paper achieve better performance than the GMM-based method in both objective and subjective tests. Furthermore, the DNN-based BWE method outperforms the BAM- and RBM-based methods, which use shallow model structures.
