Abstract

This study aims to improve speech quality in the presence of background noise, which often disrupts clear communication. We focus on efficient models that run well on resource-constrained devices. Drawing on computational auditory scene analysis techniques, we train our models to separate speech from background noise while keeping computational demands low. We introduce two models: CRN-WRC (Convolutional Recurrent Network without Residual Connections) and CRN-RCAG (Convolutional Recurrent Network with Residual Connections and Attention Gates). Our experiments show that both models significantly improve speech quality and intelligibility, even in noisy environments with varying background noise levels. Notably, CRN-RCAG consistently outperforms CRN-WRC, particularly on untrained noise types. Integrating residual connections and attention gates into CRN-RCAG yields these gains while maintaining computational efficiency.

