Abstract

With the increasing popularity of the Internet of Things and smart factories, robust sound acquisition plays an important role in communication and human–machine interaction, for example, in-vehicle voice control, smart homes, and robot voice interaction. However, due to the inherent sensing mechanism of the widely used microphone, existing audio-only sound acquisition systems have fundamental limitations in solving the cocktail party problem and struggle to adapt to complex acoustic environments. Given the advantages of millimeter-wave (mmWave) radar in target positioning and precise vibration measurement, sound perception based on mmWave vibration measurement has great potential to address these problems, but high-quality sound recovery remains difficult. In this letter, we propose RFMic-Phone, a robust sound acquisition system that fuses the sound signals captured by a mmWave radar and a traditional microphone, offering a novel approach to reliable sound acquisition in complex acoustic environments. The microphone captures the mixed, high-fidelity audio signal, while the radar measures the vibration of the sound source. To achieve intelligent and effective feature fusion, we employ a deep learning framework with a modified convolutional encoder–decoder neural network. Moreover, we propose using the real and imaginary spectra of the target sound source as the network input, allowing the network to recover high-quality target sound signals. The experimental results show that RFMic-Phone achieves robust, high-quality sound signal acquisition in a variety of complex acoustic environments.
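To make the described fusion concrete, the sketch below illustrates one way a convolutional encoder–decoder could combine the microphone's complex spectrogram (real and imaginary channels) with the radar-measured vibration spectrogram and predict the real and imaginary spectra of the target source. This is a minimal illustration, not the authors' released code: the layer sizes, channel counts, and the skip-free topology are assumptions, and the class name `RFMicFusionNet` is hypothetical.

```python
# Minimal sketch of a mic + radar spectrogram fusion network (assumed topology).
import torch
import torch.nn as nn


class RFMicFusionNet(nn.Module):
    def __init__(self, in_channels: int = 4, out_channels: int = 2):
        super().__init__()
        # Encoder: downsample the stacked mic + radar spectrogram channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder: upsample back to the input resolution and emit the
        # real/imaginary spectra of the target sound source.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, out_channels, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, mic_spec: torch.Tensor, radar_spec: torch.Tensor) -> torch.Tensor:
        # mic_spec:   (batch, 2, freq, time) -- real and imaginary mic spectra
        # radar_spec: (batch, 2, freq, time) -- radar vibration spectra
        x = torch.cat([mic_spec, radar_spec], dim=1)
        return self.decoder(self.encoder(x))


if __name__ == "__main__":
    net = RFMicFusionNet()
    mic = torch.randn(1, 2, 256, 128)    # complex (re/im) microphone spectrogram
    radar = torch.randn(1, 2, 256, 128)  # radar vibration spectrogram, same grid
    target_spec = net(mic, radar)
    print(target_spec.shape)             # torch.Size([1, 2, 256, 128])
```

A complex-spectrum (real plus imaginary) target, rather than a magnitude mask alone, lets such a network reconstruct both the amplitude and phase of the target speaker, which is consistent with the abstract's emphasis on high-quality recovery.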
