Abstract

<span>Determining and classifying pathological human sounds are still an interesting area of research in the field of speech processing. This paper explores different methods of voice features extraction, namely: Mel frequency cepstral coefficients (MFCCs), zero-crossing rate (ZCR) and discrete wavelet transform (DWT). A comparison is made between these methods in order to identify their ability in classifying any input sound as a normal or pathological voices using support vector machine (SVM). Firstly, the voice signal is processed and filtered, then vocal features are extracted using the proposed methods and finally six groups of features are used to classify the voice data as healthy, hyperkinetic dysphonia, hypokinetic dysphonia, or reflux laryngitis using separate classification processes. The classification results reach 100% accuracy using the MFCC and kurtosis feature group. While the other classification accuracies range between~60% to~97%. The Wavelet features provide very good classification results in comparison with other common voice features like MFCC and ZCR features. This paper aims to improve the diagnosis of voice disorders without the need for surgical interventions and endoscopic procedures which consumes time and burden the patients. Also, the comparison between the proposed feature extraction methods offers a good reference for further researches in the voice classification area.</span>

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call