Abstract
Text-independent speaker recognition from short utterances is a very difficult task because of the large variation and inconsistency of content between short utterances. To improve speaker recognition, the plan is to extract several sets of distinguishing features that carry more voice-related information. The results show that the DNN i-vector system outperforms the GMM i-vector system across utterance durations. However, the performance of both systems degrades significantly as utterance duration decreases. To address this problem, we propose two new nonlinear mapping methods that train DNN models to map i-vectors extracted from short utterances to the corresponding i-vectors of long utterances.
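The mapping idea can be illustrated with a minimal sketch. The abstract does not specify the two proposed methods, so the code below is not the authors' method: it is a generic one-hidden-layer regression network trained with mean squared error to map a short-utterance i-vector to its long-utterance counterpart. The data are synthetic stand-ins, and every name and value here (`make_synthetic_pairs`, `init_mlp`, the dimension of 20, the 64 hidden units, the learning rate) is an assumption made for illustration.

```python
import numpy as np


def make_synthetic_pairs(n=512, dim=20, noise=0.5, seed=0):
    """Build synthetic (short, long) vector pairs. Not real i-vectors."""
    rng = np.random.default_rng(seed)
    long_iv = rng.standard_normal((n, dim))
    # Assumption: a short-utterance i-vector behaves like a distorted,
    # noisy version of the long-utterance i-vector of the same speaker.
    short_iv = np.tanh(long_iv) + noise * rng.standard_normal((n, dim))
    return short_iv, long_iv


def init_mlp(dim, hidden, seed=0):
    """Initialise a one-hidden-layer network with scaled Gaussian weights."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.standard_normal((dim, hidden)) / np.sqrt(dim),
        "b1": np.zeros(hidden),
        "W2": rng.standard_normal((hidden, dim)) / np.sqrt(hidden),
        "b2": np.zeros(dim),
    }


def forward(params, x):
    """Return the mapped vectors and the hidden activations."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h


def mse(params, x, y):
    """Mean squared error between mapped short vectors and long vectors."""
    pred, _ = forward(params, x)
    return float(np.mean((pred - y) ** 2))


def train(params, x, y, lr=0.1, epochs=300):
    """Full-batch gradient descent on the MSE; returns a new parameter dict."""
    params = {k: v.copy() for k, v in params.items()}
    for _ in range(epochs):
        pred, h = forward(params, x)
        # Gradient of the mean squared error with respect to the output.
        grad_out = 2.0 * (pred - y) / pred.size
        grad_w2 = h.T @ grad_out
        grad_b2 = grad_out.sum(axis=0)
        # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
        grad_pre = (grad_out @ params["W2"].T) * (1.0 - h ** 2)
        grad_w1 = x.T @ grad_pre
        grad_b1 = grad_pre.sum(axis=0)
        params["W1"] -= lr * grad_w1
        params["b1"] -= lr * grad_b1
        params["W2"] -= lr * grad_w2
        params["b2"] -= lr * grad_b2
    return params


if __name__ == "__main__":
    short_iv, long_iv = make_synthetic_pairs()
    untrained = init_mlp(dim=short_iv.shape[1], hidden=64)
    trained = train(untrained, short_iv, long_iv)
    print("MSE before training:", mse(untrained, short_iv, long_iv))
    print("MSE after training: ", mse(trained, short_iv, long_iv))
```

In a real system the pairs would come from i-vectors extracted from a long recording and from short segments of the same speaker, and the mapped vector would then be passed to the usual scoring back end in place of the raw short-utterance i-vector.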
More From: Bulletin of the National Engineering Academy of the Republic of Kazakhstan