Abstract
In this paper, we present a deep neural network architecture comprising of both convolutional neural network (CNN) and recurrent neural network (RNN) layers for real-time single-channel speech enhancement (SE). The proposed neural network model focuses on enhancing the noisy speech magnitude spectrum on a frame-by-frame process. The developed model is implemented on the smartphone (edge device), to demonstrate the real-time usability of the proposed method. Perceptual evaluation of speech quality (PESQ) and short-time objective intelligibility (STOI) test results are used to compare the proposed algorithm to previously published conventional and deep learning-based SE methods. Subjective ratings show the performance improvement of the proposed model over the other baseline SE methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.