Abstract

Efficient and robust sound source recognition and localization is a basic capability that humanoid robots need in order to react to their environments. The fixed sensor arrays and limited computational resources of humanoid robots make this task challenging. This article proposes a sound source recognition and localization framework that achieves real-time, precise recognition and localization using a humanoid robot’s sensor array. The type of audio is recognized from the cross-correlation function, and the steered response power-phase transform (SRP-PHAT) function in discrete angle space is used to search for the sound source direction. The framework also presents a new multi-robot collaboration system that obtains a precise three-dimensional sound source position and introduces a distance weighting revision method to improve localization performance. Experiments carried out on the humanoid robot NAO demonstrate that the proposed approaches can recognize and localize the sound source efficiently and robustly.

Highlights

  • Humanoid robots have been designed to interact with people and react to environments

  • Because humanoid robots have fixed sensor arrays and a sound event can occur in an arbitrary direction in three-dimensional (3D) space with noise and reverberation, sound source recognition and localization (SSRL) is challenging

  • This article presents an SSRL framework implemented for 3D SSRL on humanoid robots


Summary

Introduction

Humanoid robots have been designed to interact with people and react to environments. Given X_M(n) as the signal segment received by the Mth microphone and q(x, y, z) as the assumed sound source position, the SRP-PHAT feature function [31] is defined as equation (10). In the “Multi-robot 2D SSL using distance weighting revision” and “Three-dimensional sound source localization” sections, we revise the 3D position to improve the precision of the result. Because we use the time difference of arrival (TDOA) as our localization cue, the localization resolution in discrete angle space is limited by the humanoid robot’s microphone array and sampling frequency [32]. A shorter distance to the microphone means higher credibility of the identified sound source direction. Using this criterion, 2D SSL revises the estimated 2D position by distance weighting revision. In 2D SSL, the sound direction information from all robots is combined into an uncorrected position, and the distance to the closest robot, together with its corresponding angle, is used to correct the position.
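The resolution limit mentioned above can be illustrated with a short calculation. The sketch below is not taken from the article: it assumes a single microphone pair, a far-field source, a speed of sound of 343 m/s, and delays measured in whole samples. The function name `resolvable_angles`, the 0.10 m spacing, and the 48 kHz sampling rate are hypothetical values chosen only for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 °C


def resolvable_angles(mic_spacing_m, sample_rate_hz):
    """Far-field arrival angles (degrees) that one microphone pair can
    distinguish when the time difference of arrival is a whole number of samples."""
    # Largest physically possible delay between the two microphones, in samples.
    max_lag = math.floor(mic_spacing_m * sample_rate_hz / SPEED_OF_SOUND)
    angles = []
    for lag in range(-max_lag, max_lag + 1):
        # Far-field model: tdoa = spacing * sin(angle) / c
        sin_angle = lag * SPEED_OF_SOUND / (sample_rate_hz * mic_spacing_m)
        # Clamp to guard against floating-point overshoot at the end-fire limit.
        sin_angle = max(-1.0, min(1.0, sin_angle))
        angles.append(math.degrees(math.asin(sin_angle)))
    return angles


# Hypothetical pair: 0.10 m spacing sampled at 48 kHz.
angles = resolvable_angles(0.10, 48_000)
print(len(angles), "distinguishable delays")
print([round(a, 1) for a in angles])
```

Under these assumptions the pair can separate only 27 integer delays, so neighbouring candidate directions are about 4 degrees apart near broadside and further apart toward the end-fire directions. A wider microphone spacing or a higher sampling frequency yields more distinct delays, which is one way to read the dependence on the microphone array and sampling frequency stated in the text.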

