Abstract

Automatic speaker verification (ASV) systems have maximum threat from replay spoofing attacks. High frequency regions of the underlying audio signal exhibit the phenomenon about their presence. It is therefore useful to decompose the underlying audio signal into frequency bands or regions for possible analysis. In this paper, an empirical mode decomposition (EMD)-based replay spoofing detection system is presented. Using EMD, each signal is decomposed into several monotonic intrinsic mode functions (IMFs). The signal is reconstructed and represented using one or more subsets of these IMFs by performing different combinations for spoofing detection. Results on ASVspoof 2017 version 2.0 and AVspoof benchmark replay attack datasets indicate that there is a potential in initial IMFs to carry replay attack patterns, and that is sufficient rather than processing the entire signal. The proposed approach can also serve as a preprocessing technique by employing dimension reduction strategy. Cross-corpus experiments on the systems indicate the limitations of ASV antispoofing systems due to mismatched conditions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call