Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection.

Jianwei Zhang, Julie Liss, Suren Jayasuriya, Visar Berisha

doi:10.1109/taslp.2023.3261753

Abstract

Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment often fail to generalize outside the training conditions or to other related applications. In this paper, we propose a deep learning framework for generating acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss is combined with a classification loss to train our deep learning model jointly. Data warping methods are used on input voice samples to improve the robustness of our method. Empirical results demonstrate that our method not only achieves high in-corpus and cross-corpus classification accuracy but also generates good embeddings sensitive to voice quality and robust across different corpora. We also compare our results against three baseline methods on clean and three variations of deteriorated in-corpus and cross-corpus datasets and demonstrate that the proposed model consistently outperforms the baseline methods.
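The abstract states that a contrastive loss and a classification loss are trained jointly but does not give their form. As a rough illustration only, the sketch below combines a softmax cross-entropy term with a classic margin-based pairwise contrastive term over a batch of embeddings. The function names, the weight `alpha`, the `margin`, and the choice of this particular contrastive formulation are all assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch (classification term)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def pairwise_contrastive(embeddings, labels, margin=1.0):
    """Margin-based contrastive loss over all unordered pairs in a batch.

    Assumed formulation: same-label pairs are pulled together (squared
    distance), different-label pairs are pushed apart until they are at
    least `margin` apart. Requires at least two samples.
    """
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1) + 1e-12)
    same = labels[:, None] == labels[None, :]
    i, j = np.triu_indices(len(labels), k=1)  # count each pair once
    pull = dists[i, j] ** 2
    push = np.maximum(0.0, margin - dists[i, j]) ** 2
    return np.where(same[i, j], pull, push).mean()

def joint_loss(embeddings, logits, labels, alpha=0.5, margin=1.0):
    """Classification loss plus a weighted contrastive loss.

    `alpha` is a hypothetical trade-off weight chosen for illustration.
    """
    return cross_entropy(logits, labels) + alpha * pairwise_contrastive(
        embeddings, labels, margin
    )

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(8, 16))       # stand-in for voice embeddings
    logits = rng.normal(size=(8, 2))     # e.g. healthy vs. dysphonic
    labels = rng.integers(0, 2, size=8)
    print(joint_loss(emb, logits, labels))
```

In a setup like this, the contrastive term shapes the embedding space so that samples with the same label cluster together, while the cross-entropy term keeps the embeddings useful for the detection task itself. Warped copies of each input (the data warping the abstract mentions) would simply be added to the batch with the label of the original sample.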

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publication Date: Jan 1, 2023
Citations: 3

Similar Papers

Enhancing Multi-tissue and Multi-scale Cell Nuclei Segmentation with Deep Metric Learning
Tomas Iesmantas ... Kristina Sutiene
Applied Sciences | VOL. 10
15 Jan 2020

Deep Program Representation Learning Analysis for Program Security
Na Li ... Guang Kou
Journal of Physics: Conference Series | VOL. 1971
01 Jul 2021

Classification of small-scale hyperspectral images with multi-source deep transfer learning
Xin Zhao ... Fei Zhu
Remote Sensing Letters | VOL. 11
05 Feb 2020

Early Alzheimer’s disease diagnosis with the contrastive loss using paired structural MRIs
Hezhe Qiao ... Fan Zhu
Computer Methods and Programs in Biomedicine | VOL. 208
22 Jul 2021
