Abstract

Owing to the widespread deployment of face and speaker recognition systems, research on attacks against neural-network-based biometric systems, which solve face or voice classification problems with a low-dimensional output vector, has drawn increasing attention. Recently, cross-modal voice-to-face (VTF) systems have learned to generate faces from voices by matching several biometric characteristics of the generated faces to those of the speakers. However, attacks on VTF systems, whose outputs are high-dimensional face images, have not yet been studied. In this paper, we introduce several adversarial attack methods for the VTF system under different attack conditions. By adding subtle perturbations to the original voice, these methods can generate a fake face that is close to a target face or far from the original face. Under the white-box setting, we formulate a multiobjective optimization that generates target faces while improving the imperceptibility of the adversarial sample. Furthermore, a stepwise iterative optimization strategy is proposed to achieve faster and more effective attacks, and the results of comparative experiments with various methods are presented. Under the black-box setting, adversarial samples crafted on surrogate models are able to generate fake faces far from the original one. Qualitative and quantitative experimental results show a high target-face matching rate, low similarity to the original face, and imperceptibility of the adversarial audio. This study provides useful insights for privacy protection and for improving the robustness of generation systems in information security.
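
To make the white-box idea concrete, the following is a minimal sketch and not the method of the paper. It assumes a toy linear map `W` standing in for the VTF generator, a squared-error distance to a target output in place of the face-matching objective, and an L2 penalty plus a box constraint `eps` as crude proxies for imperceptibility. All names (`targeted_attack`, `lam`, `eps`) are hypothetical, and plain projected gradient descent is used rather than the stepwise strategy mentioned above.

```python
import random

def matvec(W, v):
    # Toy stand-in for the generator: G(v) = W v.
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]

def mat_t_vec(W, r):
    # Computes W^T r, needed for the gradient of the fitting term.
    return [sum(W[i][j] * r[i] for i in range(len(W))) for j in range(len(W[0]))]

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def targeted_attack(W, x, y_target, lam=0.1, eps=0.05, iters=300):
    """Minimize ||W(x + delta) - y_target||^2 + lam * ||delta||^2
    subject to |delta_i| <= eps, by projected gradient descent."""
    # The squared Frobenius norm upper-bounds the squared spectral norm,
    # so this step size keeps the objective from increasing.
    frob_sq = sum(w * w for row in W for w in row)
    step = 1.0 / (2.0 * (frob_sq + lam))
    delta = [0.0] * len(x)
    for _ in range(iters):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        residual = [g - t for g, t in zip(matvec(W, x_adv), y_target)]
        grad_fit = mat_t_vec(W, residual)
        # Gradient step on both objectives, then projection onto the eps-box.
        delta = [
            max(-eps, min(eps, d - step * (2.0 * g + 2.0 * lam * d)))
            for d, g in zip(delta, grad_fit)
        ]
    return delta

random.seed(0)
n_in, n_out = 8, 32  # assumed toy sizes: "voice" features in, "face" features out
W = [[random.gauss(0.0, 0.25) for _ in range(n_in)] for _ in range(n_out)]
x = [random.gauss(0.0, 1.0) for _ in range(n_in)]
shift = [0.04 * random.choice([-1.0, 1.0]) for _ in range(n_in)]
y_target = matvec(W, [xi + si for xi, si in zip(x, shift)])  # a reachable target

delta = targeted_attack(W, x, y_target)
dist_before = sq_dist(matvec(W, x), y_target) ** 0.5
dist_after = sq_dist(matvec(W, [xi + di for xi, di in zip(x, delta)]), y_target) ** 0.5
print(dist_before, dist_after, max(abs(d) for d in delta))
```

In this toy setting the weight `lam` trades off closeness to the target against the size of the perturbation, which is the kind of tension a multiobjective formulation has to balance. A real VTF generator is nonlinear and would need automatic differentiation and perceptual audio and face metrics instead of these stand-ins.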
