Abstract
For speaker verification systems in practical use, it is important to investigate the effect of spoofing attacks based on professional and non-professional voice imitation. This research proposes a method for analyzing voice imitation using the distances between three GMM-based acoustic models trained on cepstral features of the "impostor's original voice," the "target person's voice," and the "impostor's imitated voice." The distance measure is defined by the Kullback-Leibler (KL) divergence. The analysis uses Japanese imitated speech produced by one male professional impersonator and six male non-professional impostors. Each impostor imitated five or six target persons whom he had never attempted to imitate before the experiments. The analysis results show that 1) although the non-professional imitators drastically change their voice features when imitating, the average acoustic distance between the imitated and target voices remains large, whereas 2) the professional imitator moves his voice characteristics toward the target voice, so that the distance between the imitated and target voices is approximately 70% of the original distance. Speaker verification experiments using an HMM-UBM-based framework show that professional imitation yields higher equal error rates than non-professional imitation.
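The distance comparison described above can be illustrated with a minimal sketch. The abstract does not state how the KL divergence between GMMs is computed, so everything below is an assumption made for illustration: the divergence is estimated by Monte Carlo sampling (it has no closed form for mixtures), it is symmetrized by summing both directions, the models use diagonal covariances, and the three toy models and all function names are hypothetical rather than taken from the study.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_logpdf(x, gmm):
    """Log density of a GMM given as a list of (weight, mean, var) tuples."""
    logs = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def gmm_sample(gmm, rng):
    """Draw one point: pick a component by weight, then sample from it."""
    weights = [w for w, _, _ in gmm]
    _, mean, var = rng.choices(gmm, weights=weights, k=1)[0]
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]

def kl_monte_carlo(p, q, n_samples=5000, seed=0):
    """Estimate KL(p || q) as the sample mean of log p(x) - log q(x), x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n_samples

def symmetric_kl(p, q):
    """Symmetrized divergence (an assumption; the abstract does not specify)."""
    return kl_monte_carlo(p, q) + kl_monte_carlo(q, p)

# Hypothetical 2-D toy models standing in for cepstral-feature GMMs.
original = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [2.0, 1.0], [1.0, 1.0])]
target = [(0.5, [4.0, 3.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])]
imitated = [(0.5, [2.0, 1.5], [1.0, 1.0]), (0.5, [4.0, 2.5], [1.0, 1.0])]

d_original_target = symmetric_kl(original, target)
d_imitated_target = symmetric_kl(imitated, target)
d_original_imitated = symmetric_kl(original, imitated)

# Ratio below 1 means the imitation moved the voice model toward the target.
ratio = d_imitated_target / d_original_target
print(f"imitated-target / original-target distance ratio: {ratio:.2f}")
```

In this toy setup the ratio plays the role of the "approximately 70% of the original distance" figure reported for the professional imitator, and `d_original_imitated` corresponds to how far an impostor moves away from his own voice. The numbers produced here reflect only the made-up toy models, not the study's data.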