Abstract
A sound source localization system is implemented that uses only three microphones to capture sound signals. The system can estimate the azimuth and elevation of a sound source in real time and with sufficient accuracy. We add an SNR measure alongside spectral entropy to help detect voiced frames. Next, synchronous FFT phase copying is adopted, and the cross-power spectrum phase is calculated to estimate the TDOA (time delay of arrival) for each frame. To improve the accuracy of the TDOA estimate, parabolic interpolation is also applied. Then, by comparing the estimated TDOA values with the theoretical ones, the azimuth and elevation of the sound source can be determined. Since one azimuth-elevation pair is estimated from each voiced frame, these per-frame estimates are combined with a weighting method to give a single final azimuth and elevation. According to the experimental results, the average errors in estimating azimuth and elevation are 4.02 and 2.18 degrees, respectively.
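As a rough illustration of the TDOA step only, the following Python sketch estimates the delay between two microphone frames from the cross-power spectrum phase and refines the peak with parabolic interpolation. It is not the authors' implementation: the function name `csp_tdoa`, the zero-padding length, and the small constant guarding the division are all assumptions made for this example, and the voiced-frame detection, phase copying, and angle lookup described above are not included.

```python
import numpy as np

def csp_tdoa(x1, x2, fs):
    """Estimate the delay of x2 relative to x1, in seconds (hypothetical helper)."""
    n = len(x1) + len(x2)            # zero-pad so circular wrap-around does not alias lags
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cross = X2 * np.conj(X1)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # keep only the phase (assumed guard against /0)
    csp = np.fft.irfft(cross, n)     # correlation-like function over lags
    csp = np.fft.fftshift(csp)       # move lag 0 to index n // 2
    k = int(np.argmax(csp))          # integer-sample peak

    # Parabolic interpolation through the peak and its two neighbours
    # gives a sub-sample correction to the peak position.
    offset = 0.0
    if 0 < k < n - 1:
        a, b, c = csp[k - 1], csp[k], csp[k + 1]
        denom = a - 2.0 * b + c
        if denom != 0.0:
            offset = 0.5 * (a - c) / denom

    return (k - n // 2 + offset) / fs
```

With three microphones, a function like this would be called once per microphone pair and per voiced frame; a positive return value means the signal reaches the second microphone later than the first.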