Abstract

This paper reports the findings of an automatic dialect identification (DID) task conducted on Ao speech data using source features. Because Ao is a tone language, the gammatonegram of the linear prediction residual is proposed as a feature for DID. Because Ao is also an under-resourced language, data augmentation was carried out to increase the size of the speech corpus; the results showed that augmentation improved DID by 14%. A perception test with Ao speakers showed that the subjects identified dialects better when the utterance duration was 3 s. Accordingly, automatic DID was conducted on utterances of various durations. A baseline DID system with the Slms feature attained an average F1-score of 53.84% on 3 s utterances. Including the source features Silpr and S improved the F1-score to 60.69%. A final system combining the Silpr, S, Slms, and Mel frequency cepstral coefficient features raised the F1-score to 61.46%.
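As a rough illustration of the linear prediction residual mentioned above, the following sketch inverse-filters one speech frame with predictor coefficients estimated by the autocorrelation method. It is not taken from the paper: the function name `lp_residual`, the prediction order of 10, and the Hamming window are assumptions chosen for illustration, and the subsequent gammatone filterbank step that would produce a gammatonegram is not shown.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter


def lp_residual(frame, order=10):
    """Return the linear prediction residual of one frame.

    Hypothetical helper: the order and window are illustrative
    assumptions, not settings reported in the paper.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # Autocorrelation of the windowed frame at lags 0..order.
    full = np.correlate(windowed, windowed, mode="full")
    mid = len(windowed) - 1
    r = full[mid:mid + order + 1].copy()
    # Tiny regularization so an all-zero frame does not give a singular system.
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12

    # Solve the symmetric Toeplitz normal equations R a = r[1:].
    a = solve_toeplitz(r[:order], r[1:order + 1])

    # Inverse filter A(z) = 1 - sum_k a_k z^{-k} applied to the frame.
    inverse_filter = np.concatenate(([1.0], -a))
    return lfilter(inverse_filter, [1.0], frame)
```

The residual has the same length as the input frame and, for a well-predicted signal, carries much less energy than the frame itself, which is why it is often treated as an approximation of the excitation source.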
