Abstract

Standard speech feature extractors such as Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Coefficients (LPC) perform poorly under noisy conditions. In this paper, two less noise-susceptible features are proposed to mitigate this deficiency. Statistical descriptors of Mel-Bands Spectral Energy (MBSE) are applied to the traditional filter-bank analysis; however, this technique increases the feature size. This issue is addressed by a transformation based on principal component analysis (PCA) that generates a new PCA-MBSE feature set. Two types of utterances, namely isolated words and continuous speech, were elicited from 103 university volunteers in a controlled room to collect speech signals from the three main ethnic groups in Malaysia. Two classifiers, K-nearest neighbors and artificial neural networks, were employed to distinguish between the Malay, Chinese and Indian accents. Experimental results using the independent test samples technique indicated promising accuracy rates of 92.7% and 93.0% on the male and female datasets respectively using the proposed PCA-MBSE features. Under severely noisy conditions, the standard MFCC and LPC features deteriorated faster than the MBSE-based features. PCA-MBSE features were the most robust: their performance deteriorated by only 17.1% and 13.6% on the male and female datasets respectively, compared with 33.1% and 31.3% for MBSE features. LPC features deteriorated by 40.2% and 32.7%, and MFCC features by 35.7% and 36.8%, on the male and female datasets respectively. In conclusion, Malaysian English is not a uniform English variety; it is colored by its diverse ethnic nuances. Incorporating accent analyzers based on the proposed techniques into automatic speech recognition could substantially improve performance in noisy environments.
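
The dimensionality-reduction and classification steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature matrix is synthetic (random class clusters standing in for MBSE statistical descriptors), and the function names, the number of retained components, and the value of k are all assumptions chosen for illustration.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (rows are samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are principal axes ordered by variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dists = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for a high-dimensional feature set (assumption):
# three classes, 30 samples each, 40 features per sample.
rng = np.random.default_rng(0)
n_per_class, n_features = 30, 40
centers = rng.normal(scale=5.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_class)

# Hold out independent test samples; fit PCA on the training portion only.
idx = rng.permutation(len(y))
train, test = idx[:60], idx[60:]
mean, comps = pca_fit(X[train], n_components=5)
Z_train = pca_transform(X[train], mean, comps)
Z_test = pca_transform(X[test], mean, comps)

pred = knn_predict(Z_train, y[train], Z_test, k=3)
accuracy = (pred == y[test]).mean()
print(f"reduced {n_features} features to {Z_train.shape[1]}; accuracy = {accuracy:.2f}")
```

Fitting the projection on the training samples alone keeps the held-out samples independent of the transformation, which matters when reporting accuracy on an independent test set.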
