Abstract
An auditory attention model comprising binaural source segregation and full localization of a target speech signal in a multi-talker environment is presented. Joint acoustic features, namely monaural, binaural, and direct-to-reverberant ratio (DRR) features, are incorporated into a deep recurrent neural network (DRNN) based joint discriminative model for the speech source segregation process. The monaural and binaural features are extracted from binaural speech mixtures of two speakers using mean Hilbert envelope coefficients (MHEC) and interaural time and level differences, respectively. The performance of the DRNN-based speech segregation is validated in terms of signal-to-interference, signal-to-distortion, and signal-to-artifact ratios and compared with existing architectures, including the deep neural network (DNN). The proposed system is found to be more suitable than monaural speech segregation, especially when the desired target and interfering sources are located at different positions. The study also proposes full localization of the segregated speech source, which makes it possible to select the desired speaker of interest from an input acoustic speech mixture in a reverberant environment. The developed system is capable of handling the binaural segregation problem under multi-source and reverberant conditions. The auditory attention model provides accurate information about speech sources even when the desired targets are located at 2 m or more with higher reverberation times.
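The binaural cues mentioned above can be illustrated with a minimal sketch: the snippet below computes per time-frequency interaural level differences (ILD) and interaural phase differences (IPD, a proxy for ITD) from a two-channel mixture. This is not the authors' implementation; the MHEC and DRR features and the DRNN mask estimator are omitted, and the frame/FFT sizes and the toy signal are illustrative assumptions rather than values from the paper.

```python
# Hedged sketch: spatial (binaural) feature extraction from a two-channel mixture.
# Assumes STFT-domain processing; parameter values are illustrative only.
import numpy as np
from scipy.signal import stft

def binaural_spatial_features(left, right, fs=16000, n_fft=512, hop=256):
    """Return ILD (dB) and IPD (radians) per time-frequency bin."""
    _, _, L = stft(left, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)
    _, _, R = stft(right, fs=fs, nperseg=n_fft, noverlap=n_fft - hop)
    eps = 1e-8
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))  # level difference in dB
    ipd = np.angle(L * np.conj(R))                                # interchannel phase difference
    return ild, ipd

if __name__ == "__main__":
    # Toy binaural mixture: the right channel is a delayed, attenuated copy of the left,
    # mimicking a source located off to one side of the head.
    fs = 16000
    t = np.arange(fs) / fs
    src = np.sin(2 * np.pi * 440 * t)
    left = src
    right = 0.7 * np.roll(src, 8)  # ~0.5 ms interaural delay, ~3 dB level difference
    ild, ipd = binaural_spatial_features(left, right, fs=fs)
    print(ild.shape, ipd.shape)
```

In a segregation pipeline such as the one described, features of this kind would be concatenated with monaural (e.g., MHEC) and DRR features and fed to the recurrent model, which estimates a mask or source spectra for the target speaker.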