Abstract

Laser Doppler Vibrometers (LDVs) are exceptionally well suited to non-contact vibration sensing in a variety of environments. This work focuses on the diarisation of conversations that might be recorded via a drone-mounted LDV: reducing the effect of external noise, extracting useful features from frames of audio, and clustering those frames into homogeneous segments based on speaker identity. The two-step noise reduction (TSNR) technique was applied to such vibroacoustic data for the first time and tested against Gaussian bandpass filtering for reducing noise from sources such as laser speckle and additional broadband ‘white’ noise. Feature extraction was then performed using a time-delay neural network, and the grouping of frames by speaker was tested with various clustering methods. Each combination of noise reduction and clustering technique was tested on a two-speaker conversation recorded via the LDV. With no added noise, the most effective combination was TSNR followed by Agglomerative Hierarchical Clustering (AHC), with a diarisation error rate of 6.13%. With additional broadband noise, the most effective combination was TSNR followed by Gaussian bandpass filtering and then clustering via AHC, with a diarisation error rate of 11.9%. This work addresses another aspect of the challenge of covertly obtaining and interpreting vibroacoustic intelligence in remote and hostile environments using LDVs.
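To make the clustering step concrete, the following is a minimal sketch of agglomerative hierarchical clustering applied to per-frame speaker embeddings. It is not the authors' implementation: the cosine distance, the average linkage, the function names `ahc` and `frame_confusion_rate`, and the toy two-dimensional embeddings are all assumptions made for illustration. The scoring function is a simplified frame-level proxy and does not reproduce the full diarisation error rate, which also accounts for missed and falsely detected speech.

```python
import math
from itertools import combinations, permutations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def ahc(embeddings, n_clusters=2):
    """Average-linkage agglomerative clustering (assumed linkage choice).

    Returns one integer cluster label per embedding.
    """
    # Start with every frame in its own cluster.
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(cosine_distance(embeddings[i], embeddings[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters (a < b always holds).
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def frame_confusion_rate(reference, hypothesis):
    """Fraction of frames mislabelled under the best cluster-to-speaker mapping.

    Simplified proxy only; assumes no more clusters than reference speakers.
    """
    hyp_labels = sorted(set(hypothesis))
    ref_labels = sorted(set(reference))
    best = len(reference)
    for perm in permutations(ref_labels, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, perm))
        errors = sum(mapping[h] != r for r, h in zip(reference, hypothesis))
        best = min(best, errors)
    return best / len(reference)


# Toy stand-ins for neural-network embeddings of five frames (hypothetical data).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
reference = ["A", "A", "B", "B", "A"]

labels = ahc(embeddings, n_clusters=2)
rate = frame_confusion_rate(reference, labels)
```

In practice a library routine such as scikit-learn's `AgglomerativeClustering` would normally replace the hand-written loop, which rescans every cluster pair at each merge and is only practical for small inputs.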

Highlights

  • Laser Doppler vibrometry originates from a 1964 technique for measuring fluid flow velocity, introduced as an alternative to examining the streamlines of injected dyes [1]

  • It is noteworthy that applying only the Gaussian bandpass filter before clustering gave the worst combined performance across all clustering approaches; in two of the three cases, performance was worse than in the corresponding case with no noise reduction at all

  • This paper has presented the development and testing of various methods for speech diarisation of vibroacoustic data, with the end goal of enabling transcription and the use of a drone-mounted laser Doppler vibrometer (LDV) for remote, non-invasive and covert intelligence gathering


Summary

Introduction

Laser Doppler vibrometry originates from a 1964 technique for measuring fluid flow velocity, introduced as an alternative to examining the streamlines of injected dyes [1]. This well-established non-contact technique for measuring surface vibration can be extended to the situation in which pressure waves produced by human speech cause nearby objects to vibrate. Using a laser Doppler vibrometer (LDV), the object's vibrations can be measured and the original speech signal acquired [2],[3]. This speech acquisition can be performed remotely and unobtrusively by mounting the LDV on a drone [4]. Once collected and processed, such intelligence can be used to create a transcript of the acquired conversation. The experimental setup used to collect the data tested throughout this paper is shown in Figure 1 (b).
