Spoken word recognition using MFCC and learning vector quantization

Esmeralda C Djamal,Neneng Nurhamidah,Ridwan Ilyas

doi:10.1109/eecsi.2017.8239119

Abstract

Identification of spoken word(s) can be used to control external device. This research was result word identification in speech using Mel-Frequency Cepstrum Coefficients (MFCC) and Learning Vector Quantization (LVQ). The output of system operated the computer in certain genre song appropriate with the identified word. Identification was divided into three classes contain words such as Klasik, Dangdut and Pop, which are used to playing three types of accordingly songs. The voice signal is extracted by using MFCC and then identified using LVQ. The training and test set were obtained from six subjects and 10 times trial of the words Klasik, Dangdut and Pop separately. Then the recorded sound signal is pre-processed using Histogram Equalization, DC Removal and Pre-emphasize to reduce noise from the sound signal, and then extracted using MFCC. The frequency spectrum generated from MFCC was identified using LVQ after passing through the training process first. Accuracy of the testing results is 92% for identification of training sets while testing new data recorded using different SNR obtained an accuracy of 46%. However, the test results of new data recorded using the same SNR with training data has an accuracy of 75.5%.

Full Text