Abstract

The rapid increase in web services and mobile apps, which collect personal data from users, has also increased the risk that their privacy may be severely compromised. In particular, the increasing variety of spoken language interfaces and voice assistants empowered by the vertiginous breakthroughs in deep learning have prompted important concerns in the European Union in terms of preserving the privacy of speech data. For instance, an attacker can record speech from users and impersonate them to obtain access to systems that require voice identification. By extracting speaker, linguistic (e.g., dialect), and paralinguistic features (e.g., age) from a speech signal, the speaker profiles can also be hacked from users through existing technology.To mitigate these weaknesses, in this study, we present a speech anonymization method based on autoencoders and adversarial training. Given an utterance, we first extract an x-vector, which is a powerful utterance-level embedding used in state-of-the-art speaker recognition. This original x-vector is transformed by an autoencoder producing a new x-vector, where speaker, gender, and accent information are suppressed through adversarial training. The anonymized speech is finally generated through a neural speech synthesizer driven by the anonymized x-vector, fundamental frequency, and phoneme information extracted from the input speech.For the evaluation, we followed the VoicePrivacy Challenge framework, where anonymization or privacy is measured using automatic speaker verification and the preservation of the intelligibility is evaluated through automatic speech recognition. Our experimental results show that the proposed method achieves higher privacy than the VoicePrivacy baseline (i.e., a higher speaker verification error) while preserving a similar intelligibility for the spoken content (i.e., a similar word error rate).

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.