Abstract

In this paper, we introduce a new approach to dialect recognition that relies on context-dependent (CD) phonetic differences between dialects as well as on phonotactics. Given a speech utterance, we obtain its phone sequence using a CD-phone recognizer. We then identify the most likely dialect of each of these CD-phones using SVM classifiers. We augment the phones with the output of these classifiers and extract augmented phonotactic features, which are then passed to a logistic regression classifier to obtain a dialect detection score. We test our approach on the task of detecting four Arabic dialects from 30-second utterances. We compare our performance to two baselines, PRLM and GMM-UBM, as well as to our own improved version of GMM-UBM, which employs fMLLR adaptation. Our approach performs significantly better than all three baselines, by 5% absolute Equal Error Rate (EER). The overall EER of our system is 6%.
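The augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes that each recognized phone already carries a dialect label from a per-phone classifier, and that the phonotactic features are relative n-gram frequencies over the augmented phone sequence. The function names, the `_` separator, the labels `EGY` and `LEV`, and the choice of unigrams plus bigrams are illustrative assumptions; the abstract does not specify the actual feature definition.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def augment_phones(phones: Sequence[str], dialect_labels: Sequence[str]) -> List[str]:
    """Attach a per-phone dialect label to each phone, e.g. 'a' + 'EGY' -> 'a_EGY'.

    The labels stand in for the output of the per-phone SVM classifiers;
    how those classifiers are trained is outside this sketch.
    """
    if len(phones) != len(dialect_labels):
        raise ValueError("phones and dialect_labels must have the same length")
    return [f"{phone}_{label}" for phone, label in zip(phones, dialect_labels)]


def ngram_features(tokens: Sequence[str], max_n: int = 2) -> Dict[Tuple[str, ...], float]:
    """Relative frequency of every n-gram of order 1..max_n.

    Frequencies are normalized separately for each n-gram order
    (an assumption of this sketch, not a detail taken from the paper).
    """
    features: Dict[Tuple[str, ...], float] = {}
    for n in range(1, max_n + 1):
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total = len(grams)
        for gram, count in Counter(grams).items():
            features[gram] = count / total
    return features


# Hypothetical recognizer output and per-phone dialect decisions.
phones = ["b", "a", "b", "a"]
labels = ["EGY", "EGY", "LEV", "EGY"]
augmented = augment_phones(phones, labels)
features = ngram_features(augmented, max_n=2)
```

In a full system, the resulting sparse feature dictionary would be converted to a fixed-length vector and passed to a logistic regression classifier, one detector per dialect, to produce the detection score.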
