Abstract

Voice-based phishing and wire-fraud attacks have become a pressing problem in recent years due to the emergence of advanced AI-based speech synthesis models. These models can generate a realistic speech signal of a known target that is difficult to distinguish from the bonafide voice of a real human. Recent security reports confirm the problem, describing vishing attacks on bank call centers, fraud or pranking targeting public figures, and spoofing of voice authentication systems. Current approaches to voice fraud are based on applying an Automatic Speaker Verification (ASV) system. In most cases, these systems are tuned on datasets that consist of wideband-quality bonafide and spoofed voice samples. This makes ASV systems vulnerable to the speech signal degradation caused by voice encoding in cellular networks and Voice over IP (VoIP). However, performance evaluations of ASV systems (namely Equal Error Rate (EER) estimates) are available almost exclusively for cellular networks. Thus, the performance of ASV systems in modern VoIP applications remains unclear. In this paper, we evaluate modern ASV systems on audio compressed with codecs used in both cellular networks (AMR and GSM codecs) and VoIP applications (G.711, G.722, AAC, Lyra and Opus codecs). In addition, we test ASV performance on audio transmitted through a popular VoIP application (Discord). The results show that applying a codec increases EER considerably (up to two times) compared to the baseline. Moreover, we observe an increase in EER of up to three times on data transmitted using Discord. We propose applying hard sample mining during training to improve the accuracy of ASV systems on compressed voice samples. This reduces EER from 21% to 16% even for the most distorted samples, obtained after aggressive voice compression by GSM codecs. Notably, the improvement for a real VoIP application is even larger: EER on Discord data decreases from 35% to 20%.
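
To illustrate the EER metric used throughout the evaluation, here is a minimal sketch of how it can be estimated from two lists of scores. It assumes that each trial receives a scalar score and that a higher score means "more likely bonafide". The function name `compute_eer` and the toy scores are hypothetical and are not taken from the paper's experiments.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Estimate the Equal Error Rate by sweeping a decision threshold.

    Assumption: higher score = more likely bonafide.
    Returns (eer, threshold) at the point where the false acceptance
    rate and the false rejection rate are closest to each other.
    """
    # Candidate thresholds: every distinct score seen in either class.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: a bonafide sample scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a spoofed sample scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # With finitely many scores FAR and FRR rarely match exactly,
            # so report their average at the closest crossing point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    bonafide = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.1, 0.2, 0.3, 0.6]
    eer, threshold = compute_eer(bonafide, spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

Under this sketch, comparing codecs would amount to scoring the same trials after each compression step and calling `compute_eer` on each resulting pair of score lists. A higher EER then indicates that the codec has made bonafide and spoofed samples harder to separate.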
