Abstract

Small agglomerative microphone array systems have been proposed for use with speech communication and recognition systems. Blind source separation methods based on frequency domain independent component analysis have shown significant separation performance, and the microphone arrays are small enough to make them portable. However, the computational complexity is very high because the conventional signal collection and processing method uses 60 microphones. In this paper, we propose a band selection method based on magnitude squared coherence. Frequency bands are selected based on the spatial and geometric characteristics of the microphone array device, which are strongly related to its dodecahedral shape, and the selected bands are nonuniformly spaced. The estimated reduction in computational complexity is 90%, with a 68% reduction in the number of frequency bands. The separation performance achieved in our experimental evaluation was 7.45 dB (signal-to-noise ratio) and 2.30 dB (cepstral distortion). These results show an improvement in performance compared to the use of uniformly spaced frequency bands.

Highlights

  • Speech communication and recognition systems are widely used in the present-day world, generally under reverberant and noisy conditions

  • The proposed band selection method uses the spatial characteristics of a dodecahedral microphone array (DHMA), and a preliminary experiment on magnitude squared coherence describes the criterion of the band selection process

  • The proposed method uses nonuniformly spaced selection of frequency bands, which contributes to improved separation performance versus uniformly spaced band selection in experiments
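To make the magnitude squared coherence (MSC) criterion mentioned above more concrete, the following is a minimal Python sketch of a Welch-style MSC estimate between two channels, followed by a simple threshold-based selection of frequency bins. It is not the authors' procedure: the function names `msc` and `select_bands`, the segment length, the threshold value, the direction of the threshold (keeping low-coherence bins), and the synthetic two-channel signal are all assumptions made for illustration only.

```python
import numpy as np


def msc(x, y, nperseg=256):
    """Welch-style magnitude squared coherence estimate between x and y.

    Uses Hann-windowed segments with 50% overlap (an illustrative choice).
    Returns one value in [0, 1] per one-sided FFT bin.
    """
    hop = nperseg // 2
    window = np.hanning(nperseg)
    n_frames = 1 + (len(x) - nperseg) // hop
    n_bins = nperseg // 2 + 1
    sxx = np.zeros(n_bins)
    syy = np.zeros(n_bins)
    sxy = np.zeros(n_bins, dtype=complex)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + nperseg)
        fx = np.fft.rfft(window * x[seg])
        fy = np.fft.rfft(window * y[seg])
        sxx += np.abs(fx) ** 2          # auto-spectrum of x
        syy += np.abs(fy) ** 2          # auto-spectrum of y
        sxy += fx * np.conj(fy)         # cross-spectrum of x and y
    # Small constant guards against division by zero in empty bins.
    return np.abs(sxy) ** 2 / (sxx * syy + 1e-12)


def select_bands(coherence, threshold=0.5):
    """Hypothetical selection rule: keep bins whose MSC is below threshold.

    The actual criterion and its direction are not given in the text above.
    """
    return np.flatnonzero(coherence < threshold)


# Synthetic two-channel example (assumed data, not from the paper):
# a shared low-pass component plus independent noise on each channel.
rng = np.random.default_rng(0)
n = 8192
shared = np.convolve(rng.standard_normal(n), np.ones(8) / 8, mode="same")
x = shared + 0.5 * rng.standard_normal(n)
y = shared + 0.5 * rng.standard_normal(n)

coh = msc(x, y)
bands = select_bands(coh, threshold=0.5)
```

Because the selection depends on the measured coherence rather than on a fixed grid, the retained bins are generally not evenly spaced, which is the sense in which such a selection is nonuniform.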


Summary

Introduction

Speech communication and recognition systems are widely used in the present-day world, generally under reverberant and noisy conditions. The observed signals include several source speech signals, mixed with each other and with the acoustic sound field. Extracting source signals and their locations, which is called encoding an acoustic field, is an important technique for acoustic schemes such as highly realistic communication and speech recognition systems. Various methods have been proposed to solve the permutation ambiguity that arises in frequency domain independent component analysis [2,3,4,5,6,7]. These include using power envelopes of separated signals at neighboring frequency channels, similarity between directivity patterns formed by a separation matrix, and large microphone arrays that surround the sound sources. A correlation between the power envelopes of separated signals can be observed at neighboring frequency channels, provided that their frequencies are very close to each other. This assumption is difficult to satisfy

