Abstract

A speech signal carries several kinds of information: the language spoken, the identity, emotion, and gender of the speaker, and the phonetic content of the utterance. This paper presents an automatic language identification system that extracts Mel-Frequency Cepstral Coefficients (MFCCs) as features, post-processes them with K-means clustering, and classifies the result with a Support Vector Machine (SVM). Clustering the MFCC features before they reach the classifier considerably reduces the complexity of the SVM, which would otherwise have to handle the very large number of MFCC features produced by each speech signal. The system is evaluated on a custom speech database of three Indian languages: English, Hindi, and Tibetan. It shows promising results, with an average classification accuracy of 81% on short-duration speech signals.
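
The abstract does not spell out how the clustering step is configured, so the following is only a minimal sketch of one plausible reading: K-means collapses a variable number of per-frame MFCC vectors into a fixed number of centroids, which are concatenated into one fixed-length vector per utterance for a downstream classifier such as an SVM. The function names `kmeans` and `utterance_vector`, the choice of k = 4, the 13 coefficients per frame, the synthetic random frames standing in for real MFCCs, and the sorting of centroids to get a repeatable ordering are all assumptions made for illustration, not details taken from the paper.

```python
import random

def kmeans(frames, k, iters=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length feature vectors."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct frames (assumed initialisation).
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        # Assignment step: group each frame with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for f in frames:
            dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(f)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def utterance_vector(frames, k=4):
    """Collapse a variable number of frames into one fixed-length vector."""
    # Cluster order is arbitrary, so sort centroids for a repeatable layout
    # (a crude, assumed convention).
    centroids = sorted(kmeans(frames, k))
    return [x for c in centroids for x in c]

# Synthetic stand-ins for MFCC frames: 13 coefficients per frame (assumed).
rng = random.Random(1)
short = [[rng.gauss(0, 1) for _ in range(13)] for _ in range(50)]
long = [[rng.gauss(0, 1) for _ in range(13)] for _ in range(300)]

v_short = utterance_vector(short)
v_long = utterance_vector(long)
print(len(v_short), len(v_long))  # both are k * 13 = 52
```

A 50-frame clip and a 300-frame clip both reduce to 52 numbers here, which illustrates why such a step could shrink the classifier's input regardless of utterance length. In practice the frames would come from an MFCC extractor and the vectors would be passed to an SVM trainer.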

