Automatic Voice Disorder Detection Using Self-Supervised Representations

Dayana Ribas, Eduardo Lleida, Miguel A Pastor, Alfonso Ortega, David Martinez, Antonio Miguel

doi:10.1109/access.2023.3243986

Open Access

https://doi.org/10.1109/access.2023.3243986

Journal: IEEE Access
Publication Date: Jan 1, 2023
Citations: 9
License type: CC BY 4.0

Affiliation: Universidad de Zaragoza

Abstract

Many speech features and models, including Deep Neural Networks (DNN), have been used to classify healthy versus pathological speech on the Saarbruecken Voice Database (SVD). However, accuracies of 80.71% for phrases and 82.8% for the vowels /aiu/ are the highest reported for SVD audio samples when the evaluation covers the wide range of pathologies in the database rather than a selection of them. This paper targets this top performance among state-of-the-art Automatic Voice Disorder Detection (AVDD) systems. Within a DNN-based AVDD system, we study the capability of Self-Supervised (SS) representation learning to describe discriminative cues between healthy and pathological speech. The system processes the temporal sequence of SS features with a single feed-forward layer and a Class-Token (CT) Transformer to classify speech as healthy or pathological. Furthermore, a suitable extension of the training set with out-of-domain data is evaluated to deal with the low availability of data for DNN-based models in voice pathology detection. Experimental results on audio samples corresponding to phrases in the SVD dataset, including all available pathologies, show classification accuracies of up to 93.36%. This means that the proposed AVDD system achieved accuracy improvements of 4.1% without the training data extension, and 15.62% with it, compared to the baseline system. Beyond the novelty of using SS representations for AVDD, obtaining accuracies above 90% in these conditions, using the whole set of pathologies in the SVD, is a milestone for voice disorder-related research. Furthermore, the study of how the amount of in-domain data in the training set relates to system performance provides guidance for the data preparation stage. Lessons learned in this work suggest guidelines for taking advantage of DNNs to boost performance when developing automatic systems for diagnosis, treatment, and monitoring of voice pathologies.
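
The pipeline described in the abstract (a temporal sequence of SS features, a single feed-forward layer, then a Class-Token Transformer that yields one healthy/pathological decision) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimension (768), the model width (64), the single attention head, the random untrained weights, and all function and parameter names are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(d_in, d_model, rng):
    # Hypothetical, randomly initialised (untrained) parameters.
    def dense(n_in, n_out):
        return rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)
    return {
        "W_ff": dense(d_in, d_model), "b_ff": np.zeros(d_model),
        "cls": rng.standard_normal(d_model),   # class token embedding
        "W_q": dense(d_model, d_model),
        "W_k": dense(d_model, d_model),
        "W_v": dense(d_model, d_model),
        "w_out": rng.standard_normal(d_model) / np.sqrt(d_model),
        "b_out": 0.0,
    }

def class_token_classify(features, params):
    """Map a (T, d_in) sequence of frame features to P(pathological)."""
    # Single feed-forward layer (linear + ReLU) applied to every frame.
    h = np.maximum(features @ params["W_ff"] + params["b_ff"], 0.0)
    # Prepend the class token, giving a (T + 1, d_model) sequence.
    x = np.vstack([params["cls"][None, :], h])
    # One single-head self-attention layer with a residual connection.
    q, k, v = x @ params["W_q"], x @ params["W_k"], x @ params["W_v"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    y = x + attn @ v
    # The class-token output (position 0) summarises the whole utterance.
    logit = y[0] @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
params = init_params(d_in=768, d_model=64, rng=rng)
utterance = rng.standard_normal((120, 768))  # stand-in for real SS features
p_pathological = class_token_classify(utterance, params)
print(f"P(pathological) = {p_pathological:.3f}")
```

Because the class token attends over every frame, the same function accepts utterances of any length and returns a single score per utterance. In a real system the weights would be trained and the input would come from a pretrained self-supervised speech model rather than random numbers.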
