The effects of microphone array processing on pitch extraction in real noisy environments

Carlos T Ishi, Dong Liang, Norihiro Hagita, Hiroshi Ishiguro

doi:10.1109/iros.2011.6094950

Abstract

Pitch extraction is important for communication robots, since pitch may carry information about the intention, attitude, or emotion expressed in the user's speech. However, current pitch extraction methods are not robust enough in real noisy environments. In the present work, we propose pitch extraction methods that combine microphone array and auditory scene analysis technologies, and evaluate pitch extraction for multiple speakers in real noisy environments. Evaluation results show that the proposed ML-PSACF (maximum likelihood adaptive beamformer with peak-pruned summary autocorrelation function) helps reduce the effects of interference and noise, leading to improvements of 23%±5% in pitch estimation rates compared with the baseline that does not use array processing.

Full Text