Abstract

Objectives: The current study presents a clinical evaluation of Vox4Health, an m-health system that estimates the possible presence of a voice disorder by calculating and analyzing the main measures required for acoustic analysis, namely, the Fundamental Frequency, jitter, shimmer, and Harmonic to Noise Ratio. Acoustic analysis is an objective, effective, and noninvasive tool used in clinical practice to perform a quantitative evaluation of voice quality.

Materials and Methods: A clinical study was carried out in collaboration with medical staff of the University of Naples Federico II. A total of 208 volunteers were recruited (mean age, 44.2 ± 13.9 years): 58 healthy subjects (mean age, 36.7 ± 13.3 years) and 150 pathological ones (mean age, 47 ± 13.1 years). Vox4Health was evaluated in terms of classification performance (sensitivity, specificity, and accuracy) using a rule-based algorithm that considers the most characteristic acoustic parameters to classify whether a voice is healthy or pathological. The performance was compared with that achieved by using Praat, one of the most commonly used tools in clinical practice.

Results: With the rule-based algorithm, the best accuracy in the detection of voice disorders, 72.6%, was obtained by using the jitter or shimmer value. The best sensitivity, about 96%, was always obtained by using jitter, and the best specificity, 56.9%, was achieved by using the Fundamental Frequency. Additionally, to improve the classification accuracy of the next version of the Vox4Health app, preliminary tests were performed with different machine learning techniques able to classify the voice as healthy or pathological. The best accuracy (77.4%) was obtained by the Logistic Model Tree algorithm, the best sensitivity (99.3%) by the Support Vector Machine, and the best specificity (36.2%) by Instance-based Learning.

Conclusions: Considering the achieved accuracy, the medical experts considered the current version of Vox4Health a “good screening tool” for the detection of voice disorders. However, the accuracy improves when machine learning classifiers are used rather than the rule-based algorithm.
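
The performance measures named in the abstract (sensitivity, specificity, and accuracy) can be illustrated with a minimal Python sketch. It assumes a single-parameter rule of the form "IF value > threshold THEN pathological", which is only one plausible reading of the rule-based algorithm. The function names, the threshold, and the toy values below are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Metrics:
    sensitivity: float  # share of pathological voices correctly flagged
    specificity: float  # share of healthy voices correctly cleared
    accuracy: float     # share of all voices correctly classified


def is_pathological(value: float, threshold: float) -> bool:
    # Hypothetical single-parameter rule: IF value > threshold THEN pathological.
    return value > threshold


def evaluate(values: Sequence[float], labels: Sequence[bool], threshold: float) -> Metrics:
    """Score the rule against ground-truth labels (True = pathological)."""
    if len(values) != len(labels) or not values:
        raise ValueError("values and labels must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for value, truly_pathological in zip(values, labels):
        predicted = is_pathological(value, threshold)
        if predicted and truly_pathological:
            tp += 1
        elif predicted:
            fp += 1
        elif truly_pathological:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(values)
    return Metrics(sensitivity, specificity, accuracy)


# Toy data invented for illustration only (not measurements from the study).
toy_values = [0.3, 0.6, 1.2, 1.4, 2.0, 0.8, 2.5, 0.9]
toy_labels = [False, False, False, True, True, True, True, False]
toy_metrics = evaluate(toy_values, toy_labels, threshold=1.0)
print(toy_metrics)
```

The direction of the comparison is itself an assumption: for a parameter where lower values suggest a disorder, the rule would be inverted.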

Highlights

  • Voice signals are sounds produced by air pressure vibrations exhaled from the lungs and modulated and shaped by the vibrations of the vocal folds and the resonance of the vocal tract

  • To classify a voice as pathological or healthy, we evaluated the four voice features estimated by the app, namely, the F0, jitter, shimmer, and Harmonic to Noise Ratio (HNR), by using IF/THEN rules, and we compared the results with those achieved by Praat, one of the main systems currently used in clinical practice, by using the same IF/THEN rules

  • The Logistic Model Tree (LMT) achieved an accuracy of about 77% while the best accuracy obtained with single acoustic parameters was achieved by jitter and shimmer (72%)
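
As a rough illustration of the perturbation measures named in the highlights, the sketch below computes local jitter and local shimmer using the common definition: the mean absolute difference between consecutive cycles divided by the mean value, expressed as a percentage. This is an assumption about the definition; the text does not state which jitter and shimmer variants Vox4Health or the study uses. The input sequences are invented, and extracting glottal periods and peak amplitudes from a recording is outside the scope of the example.

```python
from typing import Sequence


def local_perturbation(cycle_values: Sequence[float]) -> float:
    """Mean absolute difference between consecutive cycles, relative to the mean, in percent."""
    if len(cycle_values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(b - a) for a, b in zip(cycle_values, cycle_values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(cycle_values) / len(cycle_values)
    return 100.0 * mean_diff / mean_value


def mean_f0(periods_s: Sequence[float]) -> float:
    """Fundamental frequency in Hz, estimated as the inverse of the mean period."""
    return len(periods_s) / sum(periods_s)


# Invented example: four consecutive glottal periods (seconds) and peak amplitudes.
periods = [0.0100, 0.0102, 0.0098, 0.0100]
amplitudes = [1.0, 0.9, 1.1, 1.0]

jitter_percent = local_perturbation(periods)      # local jitter over periods
shimmer_percent = local_perturbation(amplitudes)  # local shimmer over amplitudes
f0_hz = mean_f0(periods)
print(f"F0 = {f0_hz:.1f} Hz, jitter = {jitter_percent:.2f}%, shimmer = {shimmer_percent:.2f}%")
```

Using one helper for both measures reflects that, under this definition, jitter and shimmer differ only in the quantity tracked per cycle (period versus amplitude).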


Summary

Introduction

Voice signals are sounds produced by air pressure vibrations exhaled from the lungs and modulated and shaped by the vibrations of the vocal folds and the resonance of the vocal tract. The physiological process that leads to the production of the voice involves several structures, such as the respiratory system, the main component that influences the intensity of the voice through the modulation of an expiratory flow with a variable pressure below the vocal folds. The auditory and central nervous systems are also involved. The former plays an important role in regulating the intensity of the voice, interacting with the central nervous system, which participates in the management of several mechanisms involved in the production of the voice, such as breathing or pneumophonic coordination [1]. Vocal abuse or incorrect lifestyle habits, such as smoking or alcohol abuse, constitute risk factors for the development of voice disorders.
