Voice Recognition Using k Nearest Neighbor and Double Distance Method

Ranny Ranny

doi:10.1109/icimsa.2016.7504045

Abstract

The voice recognition process starts with voice feature extraction using Mel Frequency Cepstrum Coefficients (MFCC). The purpose of the MFCC method is to obtain signal features that correlate with the human voice. The MFCC method requires the signal to be converted from analog to digital. The digital signal is in the time domain, which makes analysis harder, so it is converted from the time domain to the frequency domain to make the analysis more accurate. After the features are obtained, the recognition step uses the k Nearest Neighbor (kNN) method with k = 1. Euclidean distance is used to measure the similarity between the training data and the testing data. Previous research shows that kNN has high accuracy on normal data but lower accuracy on data containing outliers. To address this problem, this research develops a new method that handles outlier data using kNN and a double distance measurement. The double distance method records the distance from each data point to the center of the kNN data, and these distances are used in the recognition step. The accuracy of the method is tested experimentally using 11 subjects as training and testing data, with each subject's voice recorded three times. The kNN method with one data center achieved an accuracy of 84.85%, and the double distance measurement achieved 96.97%. The results show that the double distance method increases the accuracy of voice recognition.
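The abstract does not give the exact formula for the double distance, so the following is only a minimal sketch under stated assumptions, not the paper's procedure. It compares plain 1-NN with Euclidean distance against one possible reading of the idea, in which each training sample is scored by its distance to the query plus the distance from the query to the center (mean) of that sample's class. The function names, the additive score, and the 2-D toy vectors (standing in for MFCC features) are all hypothetical. The last helper checks that the reported accuracies match 28 and 32 correct out of 33, assuming all 11 x 3 recordings were scored.

```python
import math

def centroid(vectors):
    """Component-wise mean of equal-length vectors, returned as a tuple."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))

def nearest_neighbor(train, query):
    """Plain 1-NN: label of the training sample closest to `query`."""
    label, _ = min(train, key=lambda item: math.dist(item[1], query))
    return label

def two_distance_label(train, query):
    """Hypothetical 'double distance' score (an assumption, see lead-in):
    distance to the sample plus distance to that sample's class center."""
    by_label = {}
    for label, vector in train:
        by_label.setdefault(label, []).append(vector)
    centers = {label: centroid(vs) for label, vs in by_label.items()}

    def score(item):
        label, vector = item
        return math.dist(vector, query) + math.dist(centers[label], query)

    label, _ = min(train, key=score)
    return label

def accuracy_percent(correct, total):
    """Accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)

# Toy data: class "B" contains one outlier at (2, 2), close to class "A".
TRAIN = [
    ("A", (0.0, 0.0)), ("A", (0.0, 1.0)), ("A", (1.0, 0.0)),
    ("B", (5.0, 5.0)), ("B", (5.0, 6.0)), ("B", (2.0, 2.0)),
]
QUERY = (1.5, 1.5)

if __name__ == "__main__":
    print(nearest_neighbor(TRAIN, QUERY))    # B (pulled in by the outlier)
    print(two_distance_label(TRAIN, QUERY))  # A
    print(accuracy_percent(28, 33), accuracy_percent(32, 33))  # 84.85 96.97
```

In this toy case the outlier of class "B" is the single nearest sample, so plain 1-NN picks "B", while adding the distance to the class center penalizes "B" because its center lies far from the query.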
