Early diagnosis and referral are crucial in the treatment of voice disorders. Recent studies indicate that voice pathology detection systems can contribute substantially to the evaluation of voice disorders and support the early diagnosis of such pathologies. These systems rely on machine learning methods, which are widely applied across diverse domains and show particular promise for voice pathology classification. However, the machine learning models and performance metrics used in these studies vary considerably, which makes it difficult to determine the best model for voice pathology classification. In this study, healthy and pathological voices were classified with state-of-the-art machine learning models, and the models' performance was compared. The voice samples were taken from the Saarbrücken Voice Database, a reputable German database. Features were extracted from the voice signals using Mel Frequency Cepstral Coefficients (MFCCs). To tune the models and assess their performance reliably, we applied hyperparameter optimization and 10-fold cross-validation. The support vector machine model achieved the highest accuracy: 99.19% and 99.50% in the classification of male and female voice pathologies, respectively.
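The 10-fold cross-validation protocol mentioned above can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration rather than the study's implementation: the data are synthetic 13-dimensional vectors standing in for MFCC features, a nearest-centroid rule stands in for the support vector machine, no hyperparameter optimization is performed, and all function names are hypothetical.

```python
import random
from statistics import mean


def stratified_k_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a similar share per class.
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)
    return folds


def fit_centroids(features, labels):
    """Stand-in classifier (not an SVM): mean feature vector of each class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, value in enumerate(vec):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def cross_validate(features, labels, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, return per-fold accuracy."""
    accuracies = []
    for test_idx in stratified_k_folds(labels, k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(labels)) if i not in held_out]
        model = fit_centroids([features[i] for i in train],
                              [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


def make_synthetic_data(n_per_class=100, n_features=13, seed=0):
    """Synthetic vectors standing in for per-recording MFCC features."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in (("healthy", 0.0), ("pathological", 1.5)):
        for _ in range(n_per_class):
            features.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            labels.append(label)
    return features, labels


features, labels = make_synthetic_data()
fold_accuracies = cross_validate(features, labels, k=10)
print(f"mean 10-fold accuracy: {mean(fold_accuracies):.3f}")
```

Stratifying the folds keeps the ratio of healthy to pathological samples roughly constant in every held-out fold, so each per-fold accuracy is measured on a comparable class mix. The accuracy printed here reflects only the synthetic data and says nothing about the reported results.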