Abstract
A speech signal carries several kinds of information: the language spoken, the identity, gender, and emotional state of the speaker, and the phonetic content of the utterance. This paper presents an automatic language identification system that uses Mel-Frequency Cepstral Coefficients (MFCCs) post-processed by K-means clustering for feature extraction and a Support Vector Machine (SVM) for classification. Applying K-means clustering to the MFCC features before passing them to the classifier considerably reduces the complexity of the SVM classifier, which would otherwise be high because of the large number of MFCC features extracted from each speech signal. The performance of the proposed system is tested on a custom speech database of three Indian languages: English, Hindi, and Tibetan. The proposed system shows promising results, with an average classification accuracy of 81% on short-duration speech signals.
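The feature-reduction idea can be illustrated with a minimal sketch. It is one plausible reading of the approach, not the system's actual implementation: the per-frame MFCC vectors of an utterance are clustered with K-means, and the centroids are concatenated into a single fixed-length vector that a classifier such as an SVM could consume. The helper names `kmeans` and `utterance_vector`, the 13 coefficients per frame, the choice of 8 clusters, and the random stand-in data are all assumptions made for illustration.

```python
import random

def kmeans(frames, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means over a list of equal-length feature vectors."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct frames (illustrative choice).
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(n_iter):
        # Assignment step: group each frame with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for f in frames:
            dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(f)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def utterance_vector(frames, k):
    """Compress a variable number of frames into one fixed-length vector."""
    # Sorting gives a deterministic centroid order (an assumption of this sketch).
    centroids = sorted(kmeans(frames, k))
    return [x for c in centroids for x in c]

# Synthetic stand-in for MFCC frames: 300 frames of 13 coefficients each.
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(300)]
vec = utterance_vector(frames, k=8)
print(len(frames) * 13, "->", len(vec))  # 3900 -> 104
```

In this sketch every utterance yields a vector of length k × 13 regardless of its duration, which is what keeps the classifier input small; in practice the frames would come from an MFCC extractor rather than a random generator.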