Abstract

Humans can localize sound sources with a single ear by using various acoustical cues. However, it is not clear how they perceive the direction of arrival (DOA) of a sound source in three-dimensional (3D) space from monaural acoustic cues alone. It has been reported that humans perceive modulation cues in the auditory-localization process. It is therefore important to determine whether the monaural DOA can be estimated by analyzing the modulation information of the observed signal. We propose a monaural DOA-estimation method based on the monaural modulation spectrum (MMS). Through polynomial regression analysis, we calculated the regression equation between the monaural DOA and MMS features and modeled the features; the proposed method is built on this polynomial regression model. We carried out simulations with several signal types and participants to simultaneously estimate the azimuth and elevation of an incoming sound source. The evaluation results indicate that the proposed method could estimate the DOA of artificial amplitude-modulation (AM) noise signals in 3D space with a root mean square error (RMSE) of 5.59 degrees, compared with the mean error of 12 degrees for human monaural hearing. We also carried out simulations using speech signals to determine the method’s applicability. The evaluation using test speech signals resulted in a mean RMSE of 6.81 degrees with a small standard deviation. Compared with the evaluation results for AM noise, the mean RMSE increased by only 22.2%. This suggests that the proposed method is applicable to realistic monaural DOA estimation.
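The general idea of fitting a polynomial regression between a modulation feature and an angle, then scoring the estimates by RMSE, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the scalar feature, its assumed relation to the angle, the polynomial degree, the direction of the regression (feature to angle), and the function names `fit_doa_regression`, `estimate_doa`, and `rmse` are all hypothetical, and the data are synthetic.

```python
import numpy as np


def fit_doa_regression(features, angles_deg, degree=3):
    """Least-squares polynomial mapping a scalar feature to an angle in degrees.

    Assumption: one scalar feature per observation; the degree is arbitrary.
    """
    return np.polyfit(features, angles_deg, degree)


def estimate_doa(coeffs, features):
    """Evaluate the fitted polynomial to obtain angle estimates in degrees."""
    return np.polyval(coeffs, features)


def rmse(estimated, reference):
    """Root mean square error between estimated and reference angles."""
    estimated = np.asarray(estimated, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimated - reference) ** 2)))


# Synthetic data for illustration only; not taken from the paper.
rng = np.random.default_rng(0)
true_angles = np.linspace(0.0, 90.0, 50)

# Hypothetical feature that grows monotonically with the angle, plus noise.
feature = 0.5 + 0.01 * true_angles + 1e-4 * true_angles**2
noisy_feature = feature + rng.normal(0.0, 0.005, size=feature.shape)

coeffs = fit_doa_regression(noisy_feature, true_angles, degree=3)
estimates = estimate_doa(coeffs, noisy_feature)
error = rmse(estimates, true_angles)
print(f"RMSE on synthetic data: {error:.2f} degrees")
```

In this sketch a single polynomial handles one angle; estimating azimuth and elevation together, as the abstract describes, would need a separate or multivariate model whose form is not specified here.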
