Abstract

To understand utterance-based human-robot interaction and to develop such a system, this paper first analyzes how loudly humans speak in a noisy environment. Experiments were conducted to measure speech loudness under 1) different noise levels, 2) different numbers of sound sources, 3) different kinds of sound sources, and 4) different distances to a robot. Synchronized sound sources add noise to the auditory scene, and the resulting utterances are recorded and compared to a previously recorded noiseless utterance. The experiments show that humans produce essentially the same sound pressure level at their own location, irrespective of distance and background noise. More precisely, the level stays within a band that depends on the distance and on the kind of sound source, including spoken language. Based on this understanding, we developed an online spoken-command recognition system for a mobile robot. The system consists of two key components: 1) a low-side-lobe microphone array that works as an omni-directional telescopic microphone, and 2) delay-and-sum beamforming (DSBF) combined with the frequency band selection (FBS) method for sound source localization and segmentation. The caller's location and a segmented sound stream are computed, and the segmented stream is then sent to a voice recognition system. The system works with up to five simultaneous sound sources whose sound pressure levels differ by up to about 18 dB. Experimental results with the mobile robot are also shown.
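For illustration, below is a minimal sketch of the delay-and-sum beamforming (DSBF) stage in Python. It assumes a far-field source and a known array geometry; the function names, the frequency-domain fractional delay, and the direction-scanning loop are illustrative choices, not the paper's implementation (which further combines DSBF with FBS for segmentation).

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C

def delay_and_sum(frames, mic_positions, direction, fs):
    """Steer the array toward `direction` (a far-field unit vector)
    by delaying each channel and summing (DSBF).

    frames        : (n_mics, n_samples) multichannel time-domain data
    mic_positions : (n_mics, 3) microphone coordinates in meters
    fs            : sampling rate in Hz
    """
    n_mics, n_samples = frames.shape
    # Relative propagation delay of each microphone for this direction.
    delays = mic_positions @ direction / SPEED_OF_SOUND
    delays -= delays.min()  # make all delays non-negative

    # Apply fractional delays as phase shifts in the frequency domain.
    spectra = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    shifts = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = np.fft.irfft(spectra * shifts, n=n_samples, axis=1)

    # The coherent sum boosts the steered direction; off-axis sources
    # (competing noise sources) add incoherently and are attenuated.
    return aligned.mean(axis=0)

def localize(frames, mic_positions, directions, fs):
    """Hypothetical localization: return the candidate direction whose
    beamformer output has the highest power."""
    powers = [np.mean(delay_and_sum(frames, mic_positions, d, fs) ** 2)
              for d in directions]
    return directions[int(np.argmax(powers))]
```

In a setup like the one described, the direction returned by such a scan would give the caller's bearing, and the beamformer output steered at that bearing would serve as the segmented sound stream passed on to the voice recognition system.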
