Abstract
Speaker-specific voice control is becoming an important means of operating 5G-Internet-of-Things-aided industrial control systems; for example, a controller may operate and adjust industrial Internet of Things equipment by voice over the telephone. However, the security of such voice control systems needs to be improved, because voice cloning technology based on transfer learning can easily imitate the controller's voice, which may lead to industrial accidents and other security risks. This article therefore studies the principles of voice cloning attacks and proposes a voice cloning attack method, in preparation for building a speaker-specific voice recognition system in the future. The key technical problem in voice cloning attacks is that high-quality personalized speech of the target speaker cannot be synthesized from small samples. Voice cloning is a very challenging problem because speech is difficult to represent in the hidden space of the model. We propose a transductive voice transfer learning method that learns the predictive function in the source domain and adaptively fine-tunes it in the target domain. The source and target learning tasks are the same, namely synthesizing speech signals from the given audio, while the datasets of the two domains differ. By assigning a different penalty value to each instance and minimizing the expected risk, an optimal model can be learned. In addition, we give an evaluation method that measures the similarity between the synthesized audio and the target speaker's original audio. Experimental results show that our method can effectively synthesize the target speaker's speech from small samples.
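The idea of giving each instance its own penalty value and minimizing the resulting risk can be illustrated with a toy example. The sketch below is not the article's speech synthesis model: it assumes a one-parameter linear regressor, a squared-error loss, and hand-picked weights (1 for source instances, 100 for target instances), all of which are hypothetical choices made only for illustration. It shows how larger penalties on a few target-domain samples pull the fitted model toward the target domain.

```python
import numpy as np

def weighted_risk(params, X, y, weights):
    """Weighted empirical risk: per-instance squared errors scaled by penalty weights."""
    residuals = X @ params - y
    return float(np.sum(weights * residuals ** 2) / np.sum(weights))

def fit_weighted(X, y, weights):
    """Closed-form minimizer of the weighted squared-error risk (weighted least squares)."""
    sqrt_w = np.sqrt(weights)
    params, *_ = np.linalg.lstsq(sqrt_w[:, None] * X, sqrt_w * y, rcond=None)
    return params

rng = np.random.default_rng(0)

# Hypothetical toy domains: many source samples (slope 2), few target samples (slope 3).
X_src = rng.normal(size=(200, 1))
y_src = 2.0 * X_src[:, 0]
X_tgt = rng.normal(size=(10, 1))
y_tgt = 3.0 * X_tgt[:, 0]

X = np.vstack([X_src, X_tgt])
y = np.concatenate([y_src, y_tgt])

# Assumed penalty values: target instances count 100 times more than source instances.
uniform = np.ones(len(y))
adapted = np.concatenate([np.ones(len(y_src)), 100.0 * np.ones(len(y_tgt))])

slope_uniform = fit_weighted(X, y, uniform)[0]
slope_adapted = fit_weighted(X, y, adapted)[0]

print("slope with uniform weights:", slope_uniform)
print("slope with target-heavy weights:", slope_adapted)
print("weighted risk of adapted fit:", weighted_risk(np.array([slope_adapted]), X, y, adapted))
```

With uniform weights the fit is dominated by the large source set; with target-heavy weights the same small target set moves the estimate much closer to the target-domain slope. How the article actually chooses its penalty values is not stated in the abstract.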