A Pathological Multi-Vowels Recognition Algorithm Based on LSP Feature

Tao Zhang,Yaqin Wu,Yanzhang Geng,Ganjun Liu,Mingyang Shi,Yangyang Shao

doi:10.1109/access.2019.2911314

Tao Zhang, Yaqin Wu + Show 4 more

Open Access

PDF Available

https://doi.org/10.1109/access.2019.2911314

Copy DOI

Export

Save

Cite

Journal: IEEE Access	Publication Date: Jan 1, 2019
Citations: 16	License type: cc-by-nc-nd

Affiliation: Tianjin University

Abstract
Full-Text PDF
Similar Papers

Abstract

Listen

At present, pathological voice recognition is mainly based on the classification of pathological voice. However, almost all the researches are based on the single vowel \a samples, but few on multi-vowels. In addition, the current researches on multi-vowels recognition are mainly for normal voices, which are unsuitable for the speech recognition of normal and pathological multi-vowels simultaneously. This paper concentrates on developing an accurate and robust feature called enhanced-bark line spectrum pair (E-BLSP) to detect and classify normal and pathological multi-vowels. We explore the impact of E-BLSP feature on recognition performance and propose an effective method based on the combination of three features including E-BLSP for pathological and normal multi-vowels. In this paper, first LSP and difference of adjacent LSP (DAL) features of a vowel are extracted. Then LSP feature is warped at bark domain to get bark line spectrum pair (BLSP). In addition, then E-BLSP feature is calculated by adjusting BLSP using DAL feature. Finally, the adjusted E-BLSP feature and other two traditional features, including linear prediction cepstrum coefficient (LPCC) and mel-frequency cepstrum coefficients (MFCC) are applied to support vector machine (SVM) and deep neural network (DNN) classifiers to explore the classification performance of single feature and feature combinations for pathological and normal vowels /a/, /i/ and /u/. The results show that the highest achieved accuracies for DNN and SVM network are 98.6190% and 96.2693%, while the largest achieved area under curves (AUC) are 0.9925 and 0.9868, correspondingly with the combination of three features including LPCC, MFCC, and E-BLSP.

Full Text