Abstract

Speech Synthesis (SS) and Voice Conversion (VC) pose a genuine threat to Automatic Speaker Verification (ASV) technology. In this paper, we evaluate a front-end anti-spoofing technique to protect ASV systems against SS and VC attacks using a standard benchmarking database. In particular, we propose a novel feature set, namely, Energy Separation Algorithm-based Instantaneous Frequency Cosine Coefficients (ESA-IFCC), to distinguish genuine speech from impostor speech. The experiments are carried out on the ASVspoof 2015 Challenge database. On the development set, the score-level fusion of the proposed ESA-IFCC feature set with Mel Frequency Cepstral Coefficients (MFCC) gave an Equal Error Rate (EER) of 3.45 %, a significant reduction from MFCC alone (6.98 %) and ESA-IFCC alone (5.43 %) with 13-D static features. The EER decreases further to 2.01 % and 1.89 % with Δ and ΔΔ features derived from the proposed ESA-IFCC features, respectively. On the evaluation set, the overall average error rate for known and unknown attacks was 6.79 % for ESA-IFCC, significantly better than for MFCC features (9.15 %).
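
The abstract does not spell out how the ESA-IFCC features are computed. As a rough illustration of the energy separation step only, the sketch below estimates instantaneous frequency using the Teager-Kaiser energy operator and the DESA-2 variant of the Energy Separation Algorithm. The choice of DESA-2, the function names, the `eps` guard, and the test tone are assumptions made for illustration; any filterbank analysis and cosine transform that the feature name suggests are not shown.

```python
import numpy as np


def teager_energy(x):
    """Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1).

    The output is two samples shorter than the input because the
    operator is undefined at the first and last sample.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_instantaneous_frequency(x, fs, eps=1e-12):
    """Estimate instantaneous frequency in Hz with DESA-2 (assumed variant).

    Omega(n) = 0.5 * arccos(1 - psi[y](n) / (2 * psi[x](n))),
    where y(n) = x(n+1) - x(n-1). The estimate is only meaningful for
    narrowband signals with frequency below fs / 4.
    """
    x = np.asarray(x, dtype=float)
    psi_x = teager_energy(x)        # samples n = 1 .. N-2
    y = x[2:] - x[:-2]              # symmetric difference, n = 1 .. N-2
    psi_y = teager_energy(y)        # samples n = 2 .. N-3
    psi_x = psi_x[1:-1]             # trim to the same range n = 2 .. N-3

    # Floor the denominator so zero or negative energies do not divide by
    # zero, then clip the arccos argument to its valid domain.
    ratio = psi_y / (2.0 * np.maximum(psi_x, eps))
    omega = 0.5 * np.arccos(np.clip(1.0 - ratio, -1.0, 1.0))
    return omega * fs / (2.0 * np.pi)


if __name__ == "__main__":
    # Hypothetical test signal: a 1 kHz tone sampled at 16 kHz.
    fs = 16000
    n = np.arange(400)
    tone = np.cos(2.0 * np.pi * 1000.0 * n / fs)
    print(desa2_instantaneous_frequency(tone, fs)[:5])
```

For a pure tone the estimate is constant and equal to the tone frequency; for speech, ESA is normally applied to narrowband subband signals, since the Teager energy of a wideband signal can turn negative and the frequency estimate then loses its meaning.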
