Abstract
In this contribution, a novel maximum likelihood (ML) based direction of arrival (DOA) estimator for concurrent speakers in a noisy reverberant environment is presented. The DOA estimation task is formulated in the short-time Fourier transform (STFT) in two stages. In the first stage, a single local DOA per time-frequency (TF) bin is selected, using the W-disjoint orthogonality property of the speech signal in the STFT domain. The local DOA is obtained as the maximum of the narrow-band likelihood localization spectrum at each TF bin. In addition, for each local DOA, a confidence measure is calculated, determining the confidence in the local estimate. In the second stage, the wide-band localization spectrum is calculated using a weighted histogram of the local DOA estimates with the confidence measures as weights. Finally, the wide-band DOA estimation is obtained by selecting the peaks in the wide-band localization spectrum. The results of our experimental study demonstrate the benefit of the proposed algorithm in a reverberant environment as compared with the classical steered response power phase transform (SRP-PHAT) algorithm.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.