Abstract

Spoofing detection, which discriminates the spoofed speech from the natural speech, has gained much attention recently. Low-dimensional features that are used in speaker recognition/verification are also used in spoofing detection. Unfortunately, they don't capture sufficient information required for spoofing detection. In this work, we investigate the use of high-dimensional features for spoofing detection, that maybe more sensitive to the artifacts in the spoofed speech. Six types of high-dimensional feature are employed. For each kind of feature, four different representations are extracted, i.e. the original high-dimensional feature, corresponding low-dimensional feature, the low- and the high-frequency regions of the original high-dimensional feature. Dynamic features are also calculated to assess the effectiveness of the temporal information to detect the artifacts across frames. A neural network-based classifier is adopted to handle the high-dimensional features. Experimental results on the standard ASVspoof 2015 corpus suggest that high-dimensional features and dynamic features are useful for spoofing attack detection. A fusion of them has been shown to achieve 0.0% the equal error rates for nine of ten attack types.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call