Abstract

Pitch extraction is important for communication robots, since pitch may carry information about the intention, attitude, or emotional expression in the user's speech. However, current pitch extraction methods are not robust enough in real noisy environments. In the present work, we propose pitch extraction methods that combine microphone array and auditory scene analysis technologies, and we evaluate pitch extraction for multiple speakers in real noisy environments. Evaluation results show that the proposed ML-PSACF (maximum likelihood adaptive beamformer with peak-pruned summary autocorrelation function) helps reduce the effects of interference and noise, leading to improvements of 23%±5% in pitch estimation rates compared with the baseline that does not use array processing.
