Abstract

Pathological speech, in its many forms, is a symptom of numerous serious diseases affecting millions of people worldwide, including more than 10 million patients with Parkinson's disease. Here, a method is proposed for detecting pathological speech using a two-dimensional (2D) convolutional neural network (CNN). Spectrograms extracted from voice recordings of healthy subjects and of patients diagnosed with Parkinson's disease are fed into the CNN. The voice samples are a subset of the benchmark mobile Parkinson Disease (mPower) study. The proposed model achieves 98% accuracy in Parkinson's disease detection (a two-class problem). Moreover, an average accuracy exceeding 94% is measured in binary tests (pathological versus healthy) conducted on the Saarbruecken Voice Database for six voice pathologies: dysphonia, functional dysphonia, hyperfunctional dysphonia, spasmodic dysphonia, vocal fold polyp, and dysody.
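As an illustration of the spectrogram step mentioned above, the following is a minimal sketch of how a voice recording might be turned into a 2D log-magnitude spectrogram suitable as CNN input. It is not the authors' pipeline: the function name `log_spectrogram`, the frame length (512 samples), hop size (128 samples), Hann window, 16 kHz sample rate, and the synthetic 440 Hz test tone are all assumptions chosen for the example.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=128, eps=1e-10):
    """Return a log-magnitude spectrogram with shape (freq_bins, frames).

    frame_len, hop, and the Hann window are illustrative assumptions,
    not parameters taken from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")

    window = np.hanning(frame_len)
    n_frames = 1 + (signal.size - frame_len) // hop

    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )

    # Magnitude of the one-sided FFT of each frame: (n_frames, frame_len // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))

    # Convert to decibels and put frequency on the first axis.
    return 20.0 * np.log10(magnitude + eps).T


# Hypothetical input: one second of a synthetic 440 Hz tone at 16 kHz,
# standing in for a real voice recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 440.0 * t)

spec = log_spectrogram(tone)
```

The resulting array can be treated as a single-channel image, which is the kind of input a 2D CNN consumes; a real pipeline would also need resampling, normalization, and fixed-size cropping or padding, none of which are specified in the abstract.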
