Abstract

In this paper, we propose a fast time-frequency masking technique for blind source separation (BSS) that relies on the sparseness of the source signals to automatically separate a mixture of two input sounds in a single signal. Because the source signals are sparse, they can be distinguished once the mixture is transformed into the time-frequency domain. Most previous methods did not address the effect of different source angles on accuracy. To overcome this problem, we first define two features: the normalized level ratio and the phase difference. Next, we use our method to decrease the variance of the Direction of Arrival (DOA). This reduces the variance of the features and, in turn, the number of k-means iterations. Finally, a mask is generated according to the clustered features. Our method does not require any prior information or parameter estimation. The motivation of the proposed design is to integrate the BSS system into smart voice appliances. In this application scenario, any non-human sound may appear and is regarded as interference. We use the Signal to Distortion Ratio (SDR) and the Signal to Interference Ratio (SIR) to compare performance. Based on the proposed system, we then present a hardware design implemented in the TSMC 90-nm CMOS process. The cost-effective design uses about 120 K gates and runs at 10 MHz. With low-power design considerations, the power consumption is only 2.92 mW.
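The masking pipeline described above (features, k-means clustering, mask) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes two microphone channels, takes the normalized level ratio to be |X1| / sqrt(|X1|^2 + |X2|^2), and starts k-means from per-feature quantiles. The function names, the toy mixing gains, and the time-disjoint test sources are all hypothetical, and the DOA variance-reduction step is not reproduced.

```python
import numpy as np

def stft(x, frame=256, hop=128):
    """Complex STFT with a Hann window, shape (frames, bins)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def tf_features(X1, X2, eps=1e-12):
    """Per-bin features (assumed definitions): level ratio and phase difference."""
    level = np.abs(X1) / np.sqrt(np.abs(X1) ** 2 + np.abs(X2) ** 2 + eps)
    phase = np.angle(X1 * np.conj(X2))  # inter-channel phase difference
    return np.stack([level.ravel(), phase.ravel()], axis=1)

def kmeans(points, k=2, iters=50):
    """Plain Lloyd k-means; centroids start at evenly spaced per-feature quantiles."""
    centers = np.quantile(points, (np.arange(k) + 0.5) / k, axis=0)
    for _ in range(iters):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy sources that never overlap in time, so every fully covered frame
# belongs to exactly one source (an idealized form of sparseness).
rng = np.random.default_rng(0)
n = 8000
a = np.concatenate([rng.standard_normal(n // 2), np.zeros(n // 2)])
b = np.concatenate([np.zeros(n // 2), rng.standard_normal(n // 2)])

# Hypothetical gain-only mixing: no delay, so the phase feature stays near
# zero here and the level ratio does the separating.
mic1 = 1.0 * a + 0.3 * b
mic2 = 0.3 * a + 1.0 * b

X1, X2 = stft(mic1), stft(mic2)
labels, centers = kmeans(tf_features(X1, X2))
labels = labels.reshape(X1.shape)

masks = [labels == j for j in range(2)]   # one binary mask per cluster
separated = [X1 * m for m in masks]       # masked spectrograms of channel 1
```

Each entry of `separated` would still need an inverse STFT to return to the time domain. With real recordings the sources overlap in time and arrive with delays, so the phase feature carries information that this gain-only toy mixture does not exercise.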

Highlights

  • Blind source separation (BSS) is a technique to estimate individual source components from their mixtures at multiple sensors

  • In this paper, we propose a fast time-frequency masking technique for blind source separation in order to automatically separate a mixture of two input sounds in a single signal

  • We use our method to decrease the variance of the Direction of Arrival (DOA); this reduces the variance of the features and, in turn, the number of k-means iterations


Summary

Introduction

Blind source separation (BSS) is a technique to estimate individual source components from their mixtures at multiple sensors. In many real-world applications, such as acoustics, the mixing process is more complex. In such systems, the mixtures are weighted and delayed, and each source contributes to the sum with multiple delays corresponding to the multiple paths by which an acoustic signal propagates to a microphone. The frequency bands in between do not carry any energy. This means that in the time-frequency domain, only a few frequency bins have high values at each time instant, while most frequency bins have values close to zero. Such a signal is sparse by definition. A sparse signal representation is essential for good separation performance, since the separation is built on the assumption that the source signals are sparse.
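The sparseness claim can be checked numerically. The sketch below is an illustration, not part of the original work: it measures how much of a signal's time-frequency energy sits in the strongest 10% of bins. The harmonic test tone (a 200 Hz fundamental with five harmonics, standing in for voiced speech), the white-noise comparison, and the helper names are all assumptions made for the example.

```python
import numpy as np

def stft_mag(x, frame=256, hop=128):
    """Magnitude STFT using a Hann window, shape (frames, bins)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack([x[i * hop:i * hop + frame] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def energy_concentration(mag, keep=0.1):
    """Fraction of total energy held by the strongest `keep` share of bins."""
    energy = np.sort((mag ** 2).ravel())[::-1]
    top = max(1, int(keep * energy.size))
    return energy[:top].sum() / energy.sum()

fs = 8000
t = np.arange(fs) / fs

# Hypothetical harmonic signal: energy only at multiples of 200 Hz.
tone = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in range(1, 6))

# White noise as a non-sparse reference.
noise = np.random.default_rng(0).standard_normal(fs)

tone_conc = energy_concentration(stft_mag(tone))
noise_conc = energy_concentration(stft_mag(noise))
print(f"harmonic tone: top 10% of bins hold {tone_conc:.1%} of the energy")
print(f"white noise:   top 10% of bins hold {noise_conc:.1%} of the energy")
```

The harmonic signal concentrates nearly all of its energy in a small set of bins, while the noise spreads it across the whole plane. That concentration is what lets a binary mask assign each bin to a single source.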
