Abstract

Speaker recognition is carried out in the space of functional parameters of the glottal cross-sectional area, which are found by solving an inverse problem. The problem is solved in two stages: first, the signal obtained by inverse filtering is approximated with a vocal source model; then the parameters of the glottal area model that generate the computed vocal source pulse are calculated. Recognition is evaluated on a database of Russian numerals from 0 to 9, separately for men (48 speakers) and women (37 speakers), using segments of stressed vowels. Several recognition methods are studied: Gaussian mixture models (GMMs), support vector machines (SVMs), discriminant analysis, the naive Bayes classifier (NB), classification trees (CTREE), and the Parzen window classifier. The best results were obtained with SVMs and the Parzen method: the average total identification error was 4.9% and 5.1%, respectively, for men, and 8.2% and 8.8%, respectively, for women.
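As an illustration of one of the classifiers named above, the following is a minimal sketch of a Parzen window classifier. It makes several assumptions that the abstract does not confirm: a Gaussian kernel, a single fixed bandwidth `h`, and equal speaker priors. The two-dimensional feature vectors and the labels `speaker_a` and `speaker_b` are hypothetical toy data, not the glottal area parameters used in the study.

```python
import math


def parzen_log_density(x, samples, h):
    """Log of a Gaussian-kernel Parzen density estimate at point x.

    x       : feature vector (sequence of floats)
    samples : training vectors of one speaker
    h       : kernel bandwidth (assumed isotropic; an illustrative choice)
    """
    d = len(x)
    # Log of the Gaussian normalisation constant in d dimensions.
    log_norm = -0.5 * d * math.log(2.0 * math.pi * h * h)
    # Exponent of each kernel centred on a training sample.
    exponents = [
        -sum((xi - si) ** 2 for xi, si in zip(x, s)) / (2.0 * h * h)
        for s in samples
    ]
    # Log-sum-exp keeps the average of kernels numerically stable.
    m = max(exponents)
    log_sum = m + math.log(sum(math.exp(e - m) for e in exponents))
    return log_norm + log_sum - math.log(len(samples))


def parzen_classify(x, train, h=0.5):
    """Return the label whose estimated density at x is highest.

    train maps a speaker label to that speaker's training vectors.
    Equal priors are assumed, so densities are compared directly.
    """
    return max(train, key=lambda label: parzen_log_density(x, train[label], h))


# Hypothetical toy data: two speakers, three 2-D feature vectors each.
train = {
    "speaker_a": [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1)],
    "speaker_b": [(3.0, 3.0), (3.1, 2.8), (2.9, 3.2)],
}

print(parzen_classify((0.1, 0.0), train))
print(parzen_classify((2.8, 3.1), train))
```

In practice the bandwidth would be tuned on held-out data, since it controls how strongly the density estimate is smoothed.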
