Abstract

This paper presents a dialect recognition system for the Kurdish language using speaker embeddings. Two main goals are followed in this research: first, we investigate the availability of dialect information in speaker embeddings, then this information is used for spoken dialect recognition in the Kurdish language. Second, we introduce a public dataset for Kurdish spoken dialect recognition named Zar. The Zar dataset comprises 16,385 utterances in 49h-36min for five dialects of the Kurdish language (Northern Kurdish, Central Kurdish, Southern Kurdish, Hawrami, and Zazaki). The dialect recognition is done with x-vector speaker embedding which is trained for speaker recognition using Vox-celeb1 and Voxceleb2 datasets. After that, the extracted x-vectors are used to train support vector machine (SVM) and decision tree classifiers for dialect recognition. The results are compared with an i-vector system that is trained specifically for Kurdish spoken dialect recognition. In both systems (i-vector and x-vector), the SVM classifier with 86% of precision results in better performance. Our results show that the information preserved in the speaker embeddings can be used for automatic dialect recognition.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.