Pathological voice detection and binary classification using MPEG-7 audio features

Ghulam Muhammad, Moutasem Melhem

doi:10.1016/j.bspc.2014.02.001

Abstract

Objectives: A pathological voice detection and classification method based on MPEG-7 audio low-level features is proposed in this paper. MPEG-7 features were originally designed for multimedia indexing, which covers both video and audio. Indexing is related to event detection, and because a pathological voice is a different event from a normal voice, we show that MPEG-7 part-4 audio low-level features perform very well in detecting pathological voices, as well as in binary classification of the pathologies.

Patients and methods: The experiments are carried out on a subset of sustained vowel (“AH”) recordings from healthy subjects and subjects with voice pathologies, taken from the Massachusetts Eye and Ear Infirmary (MEEI) database. A support vector machine (SVM) is used for classification. Feature selection based on the Fisher discrimination ratio is optionally applied.

Results: The proposed method, combining MPEG-7 audio features with SVM classification, is evaluated on voice pathology detection and on binary classification of pathologies. It achieves an accuracy of 99.994% with a standard deviation of 0.0105% for detecting pathological voices, and an accuracy of up to 100% for binary classification of pathologies.

Conclusion: MPEG-7 descriptors can reliably be used for automatic voice pathology detection and classification.
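As an illustration of the optional feature selection step, the sketch below scores each feature with one common form of the Fisher discrimination ratio, (mean_a - mean_b)^2 / (var_a + var_b), and keeps the highest-scoring features. The abstract does not state the exact formula or the number of features retained, so the formula variant, the function names `fisher_ratios` and `top_k`, and the toy data are assumptions made for illustration only, not the authors' implementation.

```python
from statistics import fmean, pvariance


def fisher_ratios(class_a, class_b):
    """Per-feature Fisher discrimination ratio for two classes.

    class_a, class_b: lists of equal-length feature vectors.
    Assumed form: (mean_a - mean_b)^2 / (var_a + var_b).
    """
    scores = []
    # zip(*rows) transposes the data so each iteration sees one feature column.
    for col_a, col_b in zip(zip(*class_a), zip(*class_b)):
        between = (fmean(col_a) - fmean(col_b)) ** 2
        within = pvariance(col_a) + pvariance(col_b)
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separable if the means differ.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Made-up toy data: feature 0 separates the classes, feature 1 does not.
normal = [[1.0, 5.0], [1.2, 4.0], [0.8, 6.0]]
pathological = [[3.0, 5.5], [3.2, 4.5], [2.8, 5.0]]

scores = fisher_ratios(normal, pathological)
selected = top_k(scores, 1)
```

In a pipeline like the one described, only the selected feature columns would then be passed to the SVM classifier.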
