Leveraging laryngograph data for robust voicing detection in speech.

Yixuan Zhang, Heming Wang, Deliang Wang

doi:10.1121/10.0034445

Abstract

Accurately detecting voiced intervals in speech signals is a critical step in pitch tracking and has numerous applications. Conventional signal processing methods and deep learning algorithms have been proposed for this task, but they require threshold parameters to be fine-tuned for each dataset and generalize poorly, which restricts their utility in real-world applications. To address these challenges, this study proposes a supervised voicing detection model that leverages recorded laryngograph data. The model, adapted from the recently developed CrossNet architecture, is trained using reference voicing decisions derived from laryngograph datasets. Pretraining is also investigated as a way to improve the model's generalization ability. The proposed model produces robust voicing detection results, outperforms other strong baseline methods, and generalizes well to unseen datasets. The source code of the proposed model with pretraining is provided, along with a list of the laryngograph datasets used, to facilitate further research in this area.
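The abstract does not say how the reference voicing decisions are derived or how detection quality is scored. The sketch below is therefore only an illustration. It assumes a per-frame F0 track obtained from the laryngograph signal in which a value of 0 marks an unvoiced frame (a common convention in pitch-tracking corpora, not confirmed by the source), and it scores a prediction by the fraction of frames whose voiced/unvoiced label disagrees with the reference. The function names `voicing_from_f0` and `voicing_decision_error` are hypothetical and are not taken from the authors' released code.

```python
from typing import List, Sequence


def voicing_from_f0(f0_track: Sequence[float]) -> List[bool]:
    """Label a frame as voiced when its reference F0 is positive.

    Assumption: unvoiced frames are encoded as F0 == 0 in the reference track.
    """
    return [f0 > 0.0 for f0 in f0_track]


def voicing_decision_error(reference: Sequence[bool],
                           predicted: Sequence[bool]) -> float:
    """Return the fraction of frames whose voicing label differs from the reference."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same number of frames")
    if len(reference) == 0:
        raise ValueError("at least one frame is required")
    mismatches = sum(r != p for r, p in zip(reference, predicted))
    return mismatches / len(reference)


# Toy data for illustration only; these values are not from the paper.
reference_f0 = [0.0, 0.0, 120.5, 118.0, 0.0]
reference_voicing = voicing_from_f0(reference_f0)
predicted_voicing = [False, True, True, True, False]
error_rate = voicing_decision_error(reference_voicing, predicted_voicing)
```

In this toy case one of the five frames is mislabeled, so `error_rate` is 0.2. A trained model would supply `predicted_voicing`; here it is written by hand to keep the sketch self-contained.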

More From: The Journal of the Acoustical Society of America
