Abstract
Speech-in-noise perception is an important research problem in many real-world multimedia applications. The noise-reduction methods contributed significantly; however rely on a priori information about the noise signals. Deep learning approaches are developed for enhancing the speech signals in nonstationary noisy backgrounds and their benefits are evaluated for the perceived speech quality and intelligibility. In this paper, a multi-objective speech enhancement based on the Long-Short Term Memory (LSTM) recurrent neural network (RNN) is proposed to simultaneously estimate the magnitude and phase spectra of clean speech. During training, the noisy phase spectrum is incorporated as a target and the unstructured phase spectrum is transformed to its derivative that has an identical structure to corresponding magnitude spectrum. Critical Band Importance Functions (CBIFs) are used in training process to further improve the network performance. The results verified that the proposed multi-objective LSTM (MO-LSTM) successfully outscored the standard magnitude-aware LSTM (MA-LSTM), magnitude-aware DNN (MA-DNN), phase-aware DNN (PA-DNN), magnitude-aware GNN (MA-GNN) and magnitude-aware CNN (MA-CNN). Moreover, the proposed speech enhancement considerably improved the speech quality, intelligibility, noise-reduction and automatic speech recognition in changing noisy backgrounds, which is confirmed by the ANalysis Of VAriance (ANOVA) statistical analysis.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: Journal of Ambient Intelligence and Humanized Computing
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.