Abstract

Laser monitoring has attracted growing attention in many application fields thanks to its inherent advantages. Analysis shows that the target speech in laser monitoring signals is often corrupted by echoes, which degrades speech intelligibility and quality and in turn hinders the identification of useful information. Cancelling echoes in laser monitoring signals is not a trivial task. In this article, we formulate the problem with a simple but effective additive echo noise model and propose cascaded deep neural networks (C-DNNs) as the mapping function from the acoustic features of noisy speech to the ratio mask of the clean signal. To validate the feasibility and effectiveness of the proposed method, we investigate the effect of echo intensity, echo delay, and training target on performance. We also compare the proposed C-DNNs with several traditional and recent DNN-based supervised learning methods. Extensive experiments demonstrate that the proposed method greatly improves the speech intelligibility and quality of the echo-cancelled signals and outperforms the comparison methods.

Highlights

  • As an emerging monitoring technology with great potential and many natural advantages, such as non-contact listening, high concealment, and immunity to electromagnetic interference, laser monitoring has become a necessary tool for public security departments and the military to conduct investigations, forensics, and intelligence acquisition [1,2]

  • We formulate echo cancellation in laser monitoring as a simple but effective additive echo noise model. Treating the echoes as noise, we propose cascaded deep neural networks (C-DNNs) to learn the ratio mask

  • Based on our observation of laser monitoring signals, this article uses an additive noise model to formulate the noisy signal and the echoes. Although this model appears crude at first glance, its simplicity facilitates the design of the C-DNNs, allowing us to progressively improve the estimate of the ratio mask by updating the training objectives stage by stage
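
The highlights above name an additive echo noise model and a ratio mask as the training target, but this summary gives no formulas for either. The following is a minimal sketch, assuming a single echo modeled as a delayed, attenuated copy of the clean speech and the commonly used square-root ideal ratio mask. The function names, the `delay` and `alpha` parameters, and the STFT settings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def add_echo(clean, delay, alpha):
    """Assumed additive model: noisy = clean + echo, where the echo is the
    clean signal delayed by `delay` samples and scaled by `alpha`."""
    clean = np.asarray(clean, dtype=float)
    echo = np.zeros_like(clean)
    if delay < len(clean):
        echo[delay:] = alpha * clean[: len(clean) - delay]
    return clean + echo, echo


def stft_magnitude(x, frame=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (illustrative settings)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def ideal_ratio_mask(clean, echo, eps=1e-12):
    """Square-root ratio mask: sqrt(|S|^2 / (|S|^2 + |N|^2)), with the echo
    playing the role of the noise N. One common definition, assumed here."""
    s2 = stft_magnitude(clean) ** 2
    n2 = stft_magnitude(echo) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))


# Synthetic stand-in for one second of speech at 16 kHz (random noise).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
noisy, echo = add_echo(clean, delay=800, alpha=0.5)
mask = ideal_ratio_mask(clean, echo)
```

In a supervised setup of this kind, a network would be trained to predict `mask` from features of `noisy`, and the predicted mask would then be applied to the noisy spectrogram. How the cascaded stages refine that estimate is not specified in this summary, so it is not sketched here.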


Summary

Introduction

As an emerging monitoring technology with great potential and many natural advantages, such as non-contact listening, high concealment, and immunity to electromagnetic interference, laser monitoring has become a necessary tool for public security departments and the military to conduct investigations, forensics, and intelligence acquisition [1,2]. The laser (red dotted line) emitted by the listening system hits an object (e.g., a flower in this sketch) near the person being monitored. Because the surrounding sound waves (purple wavy line) cause subtle vibrations of the object's surface, the laser reflected from that surface carries the oscillation information of the indoor speech wave and is captured by the receiving device. In laser monitoring, the signals acquired by the listening system are usually affected by ambient light sources, atmospheric noise, thermal noise, and similar disturbances. By studying and analyzing actual laser monitoring signals, the researchers found that the monitored speech is often subject to echo interference, especially when it is recorded in relatively large meeting rooms.

