Abstract

This manuscript proposes a non-invasive, robust architecture for voice pathology detection and classification. Instead of conventional feature-based machine learning techniques, the proposed architecture first performs deep learning-based filtering of the input voice signal and then applies a decision-level fusion of a deep learning model and a non-parametric learner. The architecture has five stages. Its efficacy is verified through a comparative study with very recent work on the same dataset that used different training algorithms. The results are reported in terms of nine classification score indices: mean average precision, sensitivity, specificity, F1 score, accuracy, error, false-positive rate, Matthews correlation coefficient, and Cohen’s kappa index. In the experiments, machine learning classifiers reached at most 96.12% accuracy, while the proposed technique achieved 99.14%, the highest accuracy among the compared techniques.
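The score indices named in the abstract can be illustrated with a minimal sketch that derives them from a binary confusion matrix (pathological vs. healthy). This is not the authors' evaluation code. The function name `binary_scores`, the example counts, and the choice to return 0.0 for an undefined ratio are assumptions made for illustration. The sketch computes plain precision; mean average precision would additionally need ranked classifier scores, which the abstract does not describe.

```python
import math


def _ratio(numerator, denominator):
    # Assumption: report 0.0 when a ratio is undefined (zero denominator).
    return numerator / denominator if denominator else 0.0


def binary_scores(tp, fp, tn, fn):
    """Derive common classification scores from confusion-matrix counts.

    tp, fp, tn, fn are the true-positive, false-positive, true-negative
    and false-negative counts of a binary classifier.
    """
    total = tp + fp + tn + fn
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)   # also called recall
    specificity = _ratio(tn, tn + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = _ratio(tp + tn, total)
    error = 1.0 - accuracy
    false_positive_rate = _ratio(fp, fp + tn)

    # Matthews correlation coefficient
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = _ratio(tp * tn - fp * fn, mcc_denominator)

    # Cohen's kappa: observed agreement corrected for chance agreement
    chance = _ratio(
        (tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total * total
    )
    kappa = _ratio(accuracy - chance, 1.0 - chance)

    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
        "error": error,
        "false_positive_rate": false_positive_rate,
        "mcc": mcc,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    for name, value in binary_scores(tp=45, fp=5, tn=40, fn=10).items():
        print(f"{name}: {value:.4f}")
```

Accuracy and error always sum to one, as do specificity and false-positive rate, so the nine indices are not independent. MCC and Cohen's kappa are the two that correct for class imbalance and chance agreement.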

Highlights

  • Speech is the most basic form of communication known among living entities, including human beings, animals, and birds

  • The adaptive moment estimation (ADAM) optimizer was used as it was more compatible with a recurrent neural network (RNN) than the stochastic gradient descent (SGD) optimizer

  • This work investigates improving the accuracy of the diagnosis of voice pathology in search of more robust solutions


Summary

Introduction

Speech is the most basic form of communication known among living entities, including human beings, animals, and birds. Speech production requires the coordinated functioning of several vital organs, which are responsible for three primary mechanisms: creating air pressure, providing vibration, and providing resonance. The muscles of the abdomen and chest, together with the rib cage, diaphragm, and lungs, collectively supply the air pressure that repeatedly vibrates the vocal folds, resulting in a “pitch”. The characteristics of the vocal cords and vocal tract in turn shape the sound that the human speech system produces. This process yields a unique voice that acts as a signature of these internal organ structures; each individual has a distinct voice. The pitch, which is the frequency of the wave produced by the mucosa, indicates whether a person’s laryngeal functioning is normal or not.

