Abstract

Automatic speaker verification (ASV) has been widely applied in a variety of industrial scenarios. In ASV, the universal background model (UBM) needs to be trained on data from a large variety of speakers so that it can learn the speaker-independent distribution of speech features. However, raw speech data contains sensitive information that is important and private to the speaker. According to recent European Union privacy regulations, uploading private raw speech data to a cloud server is forbidden. Thus, industry needs a new ASV model that alleviates data scarcity and protects data privacy simultaneously. In this work, we propose a novel framework named Federated Speaker Verification with Personal Privacy Preservation, or FedSP, which enables multiple clients to jointly train a high-quality speaker verification model while providing strict privacy preservation for speakers. To address data scarcity, FedSP builds on the federated learning (FL) framework, which keeps raw speech data on each device while the clients jointly train the UBM to learn the speech features well. To address privacy preservation, FedSP provides stricter privacy preservation than the basic FL framework by selecting and hiding sensitive information in the raw speech data before the UBM is jointly trained. Experimental results on two paired speech datasets demonstrate that FedSP achieves superior performance in terms of data utility and privacy preservation.

Keywords

Speaker verification, Federated learning, Privacy preservation, Sensitive information
