Abstract

Presented is a method to mitigate noise and interference in automated speaker identification (SID). The process uses the MIT/LL SID module without modification. Speaker models are built for a lattice of signal-to-noise ratio (SNR) levels. To estimate the SNR of a received signal, speech activity detection is first applied to identify the portions of the signal that actually contain speech, and a voice quality estimation process is then applied to estimate the SNR. The speaker models corresponding to the estimated SNR are dynamically loaded, and conventional SID is applied. In training, the SNR of each training signal is estimated, and noise is added to the signal to produce a signal at the desired SNR. In this way, each signal may be used to train models at any SNR level less than or equal to the SNR of the original signal. The process has been fully implemented and is completely automated.
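
The noise-addition and model-selection steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the assumption that added noise is uncorrelated with the speech and with the noise already present, the assumption that the input contains only speech-active samples, and the nearest-level rule for choosing a model from the lattice are all hypothetical choices made for illustration.

```python
import math


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def added_noise_power(mix_power, est_snr_db, target_snr_db):
    """Noise power to add so a signal at est_snr_db ends up at target_snr_db.

    Assumes mix_power = speech power + existing noise power, and that the
    added noise is uncorrelated with both (illustrative assumptions).
    """
    if target_snr_db > est_snr_db:
        # Adding noise can only lower the SNR, never raise it.
        raise ValueError("target SNR must not exceed the estimated SNR")
    ratio = 10.0 ** (est_snr_db / 10.0)
    speech_power = mix_power * ratio / (1.0 + ratio)
    existing_noise_power = mix_power / (1.0 + ratio)
    target_noise_power = speech_power / 10.0 ** (target_snr_db / 10.0)
    return max(0.0, target_noise_power - existing_noise_power)


def degrade_to_snr(signal, noise, est_snr_db, target_snr_db):
    """Scale `noise` and add it to `signal` to reach the target SNR.

    `signal` is assumed to hold only speech-active samples, and `noise`
    must be at least as long as `signal`.
    """
    needed = added_noise_power(power(signal), est_snr_db, target_snr_db)
    gain = math.sqrt(needed / power(noise))
    return [s + gain * n for s, n in zip(signal, noise)]


def nearest_lattice_level(est_snr_db, lattice):
    """Pick the lattice SNR level closest to the estimate (hypothetical rule)."""
    return min(lattice, key=lambda level: abs(level - est_snr_db))
```

Under these assumptions, a training signal estimated at 20 dB could be degraded to 15, 10, 5, or 0 dB to train each lower level of a lattice such as `[0, 5, 10, 15, 20]`, while a request for 25 dB raises an error, which mirrors the constraint that a signal can only train models at or below its original SNR.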
