Abstract

Automatic Speaker Verification (ASV) systems verify the claimed identity of a speaker from speech samples. Technological advances have made practical ASV systems possible, but have also exposed their vulnerability to spoofing attacks. Replay is one such attack, in which the ASV system is fooled with pre-recorded speech samples of a target speaker. In replayed speech, both magnitude-based and phase-based spectral features are affected by the quality of the intermediate devices and their environments. Only a few studies have reported detecting replay attacks with phase features. In this paper, we explore the relative significance of various phase-based features for replay attack detection. Magnitude-based features are combined with the phase-based features through score-level fusion to capture possible complementary information. On the evaluation set of the ASVspoof 2017 version 1 database, the best combination of magnitude-based and phase-based features gives an Equal Error Rate (EER) of 12.25 %, which is lower than that of any individual feature set, while score-level fusion of the phase-based features alone gives an EER of 13.14 %.
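For readers unfamiliar with the two evaluation terms used above, the following is a minimal sketch of score-level fusion and of an EER estimate. It is not the authors' implementation: the weighted-sum fusion rule, the function names `fuse_scores` and `equal_error_rate`, and all scores are assumptions made for illustration, and the abstract does not state which fusion rule or weights were used.

```python
def fuse_scores(score_sets, weights):
    """Fuse per-trial scores from several systems by a weighted sum.

    score_sets: one list of scores per system, all of the same length.
    weights: one weight per system (hypothetical values, not from the paper).
    """
    if len(score_sets) != len(weights):
        raise ValueError("need exactly one weight per score set")
    n_trials = len(score_sets[0])
    if any(len(scores) != n_trials for scores in score_sets):
        raise ValueError("all score sets must cover the same trials")
    return [
        sum(w * scores[i] for w, scores in zip(weights, score_sets))
        for i in range(n_trials)
    ]


def equal_error_rate(genuine, spoof):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted as genuine when its score is >= the threshold.
    Returns the mean of the false acceptance and false rejection rates
    at the threshold where the two are closest.
    """
    best_gap = float("inf")
    eer = None
    for t in sorted(set(genuine) | set(spoof)):
        frr = sum(g < t for g in genuine) / len(genuine)  # genuine rejected
        far = sum(s >= t for s in spoof) / len(spoof)     # replay accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores for four genuine and four replayed trials (made up).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
spoof_scores = [0.6, 0.3, 0.2, 0.1]
toy_eer = equal_error_rate(genuine_scores, spoof_scores)

# Equal-weight fusion of two hypothetical systems over two trials.
fused = fuse_scores([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
```

In this sketch a higher score means "more likely genuine". A lower EER indicates better separation between genuine and replayed speech, which is the sense in which the fused result of 12.25 % improves on the individual feature sets.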
