Abstract
A Forward Blind Source Separation (FBSS) structure is frequently employed to enhance speech signals in noisy environments. This structure recovers the speech signal at its output from two noisy observations at its input. However, most FBSS-based speech enhancement methods rely on a manual Voice Activity Detection (VAD) system. In this work, we propose a new algorithm that combines FBSS with a Deep Neural Network (DNN) system for automatic VAD (denoted DVAD). The algorithm uses a multi-class classification mechanism to identify the type of noise, and a DVAD dedicated to each specific type. In the first part, we constructed the DNN models using the TIMIT database with various types of noise. The dataset was then partitioned into three subsets: 75% for training, 15% for validation, and the remaining 10% for testing. After preparing the recordings, we combined them with six different types of noise from the NOISEX-92 collection. In the second part, we integrated the DVAD system into the FBSS structure to cancel the noise component from the noisy observations. The algorithm yields better results even in negative-SNR environments. We demonstrate the efficiency of the proposed algorithm in terms of objective and subjective criteria.
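The data preparation described above (a 75/15/10 partition and mixing clean recordings with noise, including at negative SNRs) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `mix_at_snr` and `split_indices`, the random shuffling, the -5 dB level, and the synthetic signals standing in for TIMIT and NOISEX-92 audio are all assumptions made for illustration.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the speech-to-noise power ratio is snr_db.

    Hypothetical helper; assumes noise is at least as long as speech.
    """
    noise = noise[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that brings the noise power to p_speech / 10^(snr_db / 10).
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + gain * noise


def split_indices(n_items, train=0.75, val=0.15, seed=0):
    """Shuffle item indices and split them into train/validation/test.

    The test subset receives whatever remains (10% with the defaults).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    n_train = int(round(train * n_items))
    n_val = int(round(val * n_items))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


# Synthetic stand-ins for one clean recording and one noise recording.
rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)
noise = rng.standard_normal(16000)

noisy = mix_at_snr(speech, noise, snr_db=-5.0)  # illustrative negative SNR
train_idx, val_idx, test_idx = split_indices(100)
```

At a negative SNR the scaled noise carries more power than the speech, which is the regime the abstract highlights as the difficult case.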