Abstract
In remote surveillance, acquiring high-quality speech from a target has long been a sought-after goal. In this paper, we propose a method based on a convolutional neural network to extract a target’s speech signals remotely. The method consists of two parts: an optical setup that captures speckle images conveniently and covertly, and a convolutional neural model that recovers speech signals from the sequence of speckle images. Correlation coefficient and root mean square error metrics demonstrate the effectiveness of the method for high-quality speech extraction. Compared with traditional spatial image correlation, our convolutional neural model is both more accurate and more efficient at processing speckle images. The model achieves an average accuracy of 94% on real data and 98% on simulated data, far better than spatial image correlation. In addition, with GPU hardware the model can process speckle images at up to 237 frames per second, compared with 10 frames per second for spatial image correlation. Experimental results show that the method is simple, efficient, and accurate, marking significant progress in remote sound extraction.
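As an illustration of the two evaluation metrics named above, the following is a minimal sketch of how a correlation coefficient and a root mean square error could be computed between a reference speech signal and a recovered one. The function names `pearson_correlation` and `rmse` are hypothetical, the Pearson form of the correlation coefficient is an assumption, and the test signals are synthetic; none of this is taken from the authors' implementation.

```python
import math


def pearson_correlation(reference, recovered):
    """Pearson correlation coefficient between two equal-length signals."""
    if len(reference) != len(recovered) or len(reference) < 2:
        raise ValueError("signals must have the same length (at least 2 samples)")
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_rec = sum(recovered) / n
    # Sum of products of deviations (covariance, up to a constant factor)
    cov = sum((a - mean_ref) * (b - mean_rec) for a, b in zip(reference, recovered))
    var_ref = sum((a - mean_ref) ** 2 for a in reference)
    var_rec = sum((b - mean_rec) ** 2 for b in recovered)
    if var_ref == 0 or var_rec == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_ref * var_rec)


def rmse(reference, recovered):
    """Root mean square error between two equal-length signals."""
    if len(reference) != len(recovered) or len(reference) == 0:
        raise ValueError("signals must have the same non-zero length")
    squared_error = sum((a - b) ** 2 for a, b in zip(reference, recovered))
    return math.sqrt(squared_error / len(reference))


if __name__ == "__main__":
    # Synthetic stand-ins for a reference signal and a recovered signal
    # (assumed data, for illustration only).
    reference = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
    recovered = [0.9 * s + 0.01 * math.cos(t) for t, s in enumerate(reference)]
    print("correlation:", pearson_correlation(reference, recovered))
    print("rmse:", rmse(reference, recovered))
```

A correlation coefficient close to 1 together with a small RMSE would indicate that the recovered waveform closely follows the reference in both shape and amplitude.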