Abstract

A robot's ability to hear sounds, in particular a mixture of sounds, with its own microphones, that is, robot audition, is important for improving human-robot interaction. This paper presents the open-source robot audition software HARK (HRI-JP Audition for Robots with Kyoto University), which consists of primitive functions for computational auditory scene analysis: sound source localization, sound source separation, and recognition of the separated sounds. Since separated sounds suffer from spectral distortion caused by the separation process, HARK generates a time-spectral map of reliability, called a missing feature mask, for the features of each separated sound. The separated sounds are then recognized by missing-feature-theory (MFT) based automatic speech recognition (ASR) using these masks. HARK is implemented on the FlowDesigner middleware so that modules can share intermediate audio data, which enables near-real-time processing.
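To illustrate the missing-feature idea described above, the following is a minimal sketch (not HARK's actual implementation; the threshold, noise estimate, and diagonal-Gaussian scoring are illustrative assumptions): a binary time-spectral mask marks a feature bin as reliable when the separated signal dominates the estimated interference, and unreliable bins are simply dropped (marginalized) from the acoustic log-likelihood.

```python
import numpy as np

def missing_feature_mask(separated_power, noise_power, snr_threshold_db=0.0):
    """Binary reliability mask over time-frequency bins: a bin counts as
    reliable when the separated signal's SNR exceeds the threshold.
    (Illustrative criterion, not HARK's exact mask-generation rule.)"""
    snr_db = 10.0 * np.log10(separated_power / np.maximum(noise_power, 1e-12))
    return (snr_db > snr_threshold_db).astype(float)

def masked_log_likelihood(features, mask, mean, var):
    """Marginalization-style MFT scoring: per-dimension diagonal-Gaussian
    log-likelihoods are summed only over bins the mask marks reliable."""
    ll = -0.5 * (np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var)
    return float(np.sum(mask * ll))

# Toy usage: one frame, two frequency bins; the second bin is distorted.
mask = missing_feature_mask(np.array([[4.0, 0.5]]), np.array([[1.0, 1.0]]))
# → [[1.0, 0.0]]: only the first bin contributes to recognition.
```

In a real MFT-based recognizer the mask gates the acoustic-model scores inside the decoder; this sketch only shows the masking and marginalization step in isolation.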
