Pathological Voices Research Articles

The electroglottogram (EGG) is a signal that measures changes in the relative contact area of the vocal cords during voice production. In recent years, low-cost and non-invasive applications of the EGG have emerged. As a result, the EGG has been applied across science, engineering and medicine: in basic voice science, including phonetics, singing and hearing, as well as in speech and language therapy and related clinical work, including voice production physiology, swallowing and psychology. However, pathological classification based on EGGs usually performs poorly. This is because an EGG must be decomposed into components before features can be extracted for classification, and the number of components produced by a time-frequency representation such as empirical mode decomposition (EMD) differs from one EGG to another. Consequently, the feature vectors extracted from different EGGs have different dimensions, which makes it difficult to build a machine learning model for the classification. This paper addresses this issue. It proposes a clustering-based method for grouping the intrinsic mode functions (IMFs) and the residue obtained by applying EMD to the EGGs, so that every EGG yields the same number of grouped IMFs, in order to classify between healthy and pathological subjects. First, the IMFs and the residue of each EGG are categorized into a desired number of groups based on their correlation coefficients. Second, the IMFs or the residue in each group are summed to obtain a grouped IMF. Third, the mean frequency and the first formant of each grouped IMF are computed. Finally, a random forest performs the classification.
To the best of our knowledge, this is the first time a joint EMD and clustering-based method has been proposed for pathological voice detection. Numerical simulations were conducted on the Saarbrücken voice database, which is available online. Five cross-validations were performed. The mean accuracy, specificity and sensitivity over these five validations are 86.98%, 79.92% and 91.57%, respectively, with standard deviations of ±2.00%, ±3.71% and ±2.13%, respectively. The simulation results show that the proposed method outperforms common EGG-based and speech-processing-based methods. It achieves the highest classification performance at the majority of signal-to-noise ratios when compared with state-of-the-art methods.
• This paper proposes a clustering-based method for grouping the IMFs and the residue.
• It is applied to classify between healthy and pathological subjects.
• The total number of grouped IMFs is the same for different EGGs.
• The mean frequency and the first formant of each grouped IMF are computed.
• A random forest is employed to perform the classification.
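The grouping and summing steps described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so the greedy agglomerative merge below is a stand-in, not the authors' method. The synthetic sinusoids and trend stand in for real IMFs and a residue (no EMD is run here), and the names `pearson`, `group_components` and `mean_frequency` are hypothetical. The first formant and the random forest stage are not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def group_components(components, n_groups):
    """Assumed clustering rule: repeatedly merge the two groups whose summed
    signals have the largest absolute correlation until n_groups remain."""
    groups = [[i] for i in range(len(components))]
    sums = [list(c) for c in components]
    while len(groups) > n_groups:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                r = abs(pearson(sums[i], sums[j]))
                if best is None or r > best[0]:
                    best = (r, i, j)
        _, i, j = best
        groups[i].extend(groups[j])
        sums[i] = [a + b for a, b in zip(sums[i], sums[j])]  # grouped IMF
        del groups[j]
        del sums[j]
    return groups, sums

def mean_frequency(signal, fs):
    """Power-weighted mean frequency from a direct DFT (O(n^2), demo only)."""
    n = len(signal)
    num = den = 0.0
    for k in range(1, n // 2 + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(signal))
        power = re * re + im * im
        num += (k * fs / n) * power
        den += power
    return num / den if den else 0.0

# Synthetic stand-ins for four IMFs and a residue (not real EGG data).
fs, n = 200, 200
t = [k / fs for k in range(n)]
components = [
    [math.sin(2 * math.pi * 40 * x) for x in t],
    [0.3 * math.sin(2 * math.pi * 40 * x + 0.2) for x in t],
    [math.sin(2 * math.pi * 5 * x) for x in t],
    [0.5 * math.sin(2 * math.pi * 5 * x + 0.1) for x in t],
    [0.2 * x for x in t],  # slow trend playing the role of the residue
]

groups, grouped = group_components(components, n_groups=3)
freqs = [mean_frequency(g, fs) for g in grouped]
print(groups)
print([round(f, 2) for f in freqs])
```

Whatever number of components the decomposition returns, fixing `n_groups` gives every recording the same number of grouped signals, and therefore a feature vector of the same length.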

Background: A multidimensional voice quality assessment is recommended for all patients with dysphonia, which requires a patient visit to the otolaryngology clinic. The aim of this study was to determine the accuracy of an online artificial intelligence classifier, the Online Sequential Extreme Learning Machine (OSELM), in detecting voice pathology. In this study, a Malaysian Voice Pathology Database (MVPD), which is the first Malaysian voice database, was created and tested.

Methods: The study included 382 participants (252 normal voices and 130 dysphonic voices) in the proposed database MVPD. Complete data were obtained for both groups, including voice samples, laryngostroboscopy videos, and acoustic analysis. The diagnoses of patients with dysphonia were obtained. Each voice sample was anonymized using a code that was specific to each individual and stored in the MVPD. These voice samples were used to train and test the proposed OSELM algorithm. The performance of OSELM was evaluated and compared with other classifiers in terms of the accuracy, sensitivity, and specificity of detecting and differentiating dysphonic voices.

Results: The accuracy, sensitivity, and specificity of OSELM in detecting normal and dysphonic voices were 90%, 98%, and 73%, respectively. The classifier differentiated between structural and non-structural vocal fold pathology with accuracy, sensitivity, and specificity of 84%, 89%, and 88%, respectively, while it differentiated between malignant and benign lesions with an accuracy, sensitivity, and specificity of 92%, 100%, and 58%, respectively. Compared to other classifiers, OSELM showed superior accuracy and sensitivity in detecting dysphonic voices, differentiating structural versus non-structural vocal fold pathology, and between malignant and benign voice pathology.

Conclusion: The OSELM algorithm exhibited the highest accuracy and sensitivity compared to other classifiers in detecting voice pathology, classifying between malignant and benign lesions, and differentiating between structural and non-structural vocal pathology. Hence, it is a promising artificial intelligence that supports an online application to be used as a screening tool to encourage people to seek medical consultation early for a definitive diagnosis of voice pathology.
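Accuracy, sensitivity and specificity, as reported in the abstracts above, all follow from the four counts of a binary confusion matrix. The sketch below shows the standard definitions, treating the pathological class as positive. The counts and per-fold accuracies are made up for illustration and are not taken from either study, and the function name `screening_metrics` is hypothetical. Whether the studies report a sample or a population standard deviation is not stated, so the use of the sample form is an assumption.

```python
import statistics

def screening_metrics(tp, fn, tn, fp):
    """tp/fn: pathological voices detected/missed;
    tn/fp: healthy voices correctly cleared/wrongly flagged."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of pathological voices detected
        "specificity": tn / (tn + fp),  # share of healthy voices cleared
    }

# Made-up counts for illustration only.
metrics = screening_metrics(tp=90, fn=10, tn=40, fp=10)

# Made-up per-fold accuracies, summarised as mean and sample standard deviation.
fold_accuracies = [0.85, 0.88, 0.86, 0.90, 0.86]
mean_acc = statistics.mean(fold_accuracies)
std_acc = statistics.stdev(fold_accuracies)

print(metrics)
print(round(mean_acc, 4), round(std_acc, 4))
```

A classifier can combine very high sensitivity with much lower specificity, as in the malignant versus benign result above: few pathological cases are missed, at the cost of more false alarms.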

Articles published on Pathological Voices

Grouping Intrinsic Mode Functions and Residue for Pathological Classifications via Electroglottograms

Quantitative acoustical analysis of genetic syndromes in the number listing task

MMHFNet: Multi-modal and multi-layer hybrid fusion network for voice pathology detection

An Efficient SMOTE-Based Deep Learning Model for Voice Pathology Detection

Acoustic speech parameter relationships with voice disorders and phrase differences

Exploring speech characteristics for automatic pathological voice detection

Determination of Harmonic Parameters in Pathological Voices—Efficient Algorithm

Voice Pathology Detection Using a Two-Level Classifier Based on Combined CNN–RNN Architecture

Combined Use of Nonlinear Measures for Analyzing Pathological Voices

Quest for Speech Enhancement Method in the Analysis of Pathological Voices

A Preliminary Investigation of the Reliability of Acoustic Parameters of Voice through Smartphone Recordings in Individuals with Dysphonia.

The accuracy of an Online Sequential Extreme Learning Machine in detecting voice pathology using the Malaysian Voice Pathology Database

Pathological voice classification based on multi-domain features and deep hierarchical extreme learning machine

A Highly Accurate Dysphonia Detection System Using Linear Discriminant Analysis

Automatic Voice Disorder Detection Using Self-Supervised Representations

Enhancing the Performance of Pathological Voice Quality Assessment System Through the Attention-Mechanism Based Neural Network

A Novel Voice Feature AVA and its Application to the Pathological Voice Detection Through Machine Learning

Hierarchical Multi-Class Classification of Voice Disorders Using Self-Supervised Models and Glottal Features

A mini-review of pathological voice recognition

A Modular Deep Learning Architecture for Voice Pathology Classification
