Abstract
Voice pathology detection is a rapidly evolving research field concerned with identifying and diagnosing voice disorders. Early detection is critical because it increases the likelihood of effective treatment and reduces the burden on medical professionals. This paper develops a deep learning model to improve the detection of voice pathology. The model extracts a set of sensitive features from the raw voice signal by analyzing its time-frequency characteristics. Specifically, we propose combining Gammatonegram features with Scalogram features computed using the Teager-Kaiser Energy Operator (TKEO); the resulting feature extraction scheme is named Combine Gammatonegram with (TKEO) Scalogram (CGT Scalogram). A ResNet classifier is then used to distinguish healthy voices from pathological ones. The model is trained and tested on the Saarbrücken Voice Database. The proposed system achieves an accuracy of 96%, a precision of 96.3%, and a recall of 96.1% for binary classification, and an accuracy of 94.4%, a precision of 94.5%, and a recall of 94% for multi-class classification. These results indicate that the proposed feature extraction scheme is effective for both binary and multi-class classification.
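For readers unfamiliar with the TKEO named above, the following is a minimal sketch of the standard discrete-time operator, ψ[n] = x[n]² − x[n−1]·x[n+1], applied to a synthetic tone. It illustrates only the operator itself; the function name `tkeo`, the 16 kHz sampling rate, and the test tone are assumptions made for illustration, and the abstract does not specify how the paper combines the operator with the scalogram or the Gammatonegram.

```python
import numpy as np

def tkeo(x):
    """Discrete Teager-Kaiser Energy Operator (hypothetical helper name).

    Computes psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples;
    the first and last samples are left at zero because they lack
    a neighbour on one side.
    """
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

# Illustrative input (assumed values): a 200 Hz tone sampled at 16 kHz.
fs = 16000
n = np.arange(1024)
amplitude = 0.5
omega = 2 * np.pi * 200 / fs          # digital frequency in rad/sample
tone = amplitude * np.cos(omega * n)

energy = tkeo(tone)

# For a pure tone A*cos(omega*n), the operator output is the constant
# A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
expected = amplitude ** 2 * np.sin(omega) ** 2
```

Because the output depends on both amplitude and frequency, the operator reacts to rapid changes in either, which is one plausible reason it is used as a preprocessing step for time-frequency features.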