Machine-Learning-Based Closed-Set Text-Independent Speaker Identification Using Speech Recorded During 25 Hours of Prolonged Wakefulness

Youngsun Kong, Hugo F Posada-Quintero, Jeffrey Bolkhovsky, Ki H Chon, Matthew S Daley

doi:10.1109/access.2021.3094175

Open Access

https://doi.org/10.1109/access.2021.3094175

Abstract

We applied machine learning to text-independent speaker identification using speech recorded during the day, evening, and night from subjects undergoing 25 hours of prolonged wakefulness. Subjects answered casual questions for approximately 3 minutes and described pictures presented to them for 0.5 minutes. We extracted 12,515 vocal features using the openSMILE software. To generalize the training scheme, we divided the 20 subjects into training and testing sets (10 subjects each) and repeated testing four times with different subsets. Specifically, we used one set of 10 subjects to find the best feature sets and the optimal machine-learning method, and the other set of 10 subjects to test the trained model. With machine-learning models trained on three speech sessions recorded throughout the day, we obtained balanced accuracies of 95% and 98.8% for daytime and evening test speech, respectively, but only 84.2% for nighttime test speech. With training data from all times of day (daytime, evening, and nighttime), we obtained balanced accuracies of 97.5%, 98.8%, and 98.1% for test data from daytime, evening, and nighttime speech, respectively; the overall accuracy was 98.1%. These results indicate that prolonged wakefulness degrades the performance of machine-learning-based speaker identification, and they suggest that such systems should be trained on speech from both daytime and nighttime sessions for better overall accuracy. Machine learning can potentially be used to identify a speaker's voice even when it is affected by tiredness and fatigue, which are frequently encountered in settings such as emergency rooms and long-duration repetitive task operations.
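The subject-level split described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `split_subjects`, the use of random partitioning, and the seed are all assumptions. The abstract states only that the 20 subjects were divided into two groups of 10 (one for selecting features and the learning method, one for testing) and that testing was repeated four times with different subsets.

```python
import random


def split_subjects(subject_ids, n_repeats=4, seed=0):
    """Yield (selection_set, test_set) pairs, each holding half of the subjects.

    Hypothetical helper: random partitioning and the seed are assumptions,
    since the abstract does not say how the subsets were chosen.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    half = len(ids) // 2
    for _ in range(n_repeats):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # First half: feature and model selection; second half: held-out test.
        yield sorted(shuffled[:half]), sorted(shuffled[half:])


# 20 subjects labelled 1..20, four repetitions of a 10/10 split.
splits = list(split_subjects(range(1, 21)))
for selection_set, test_set in splits:
    print(selection_set, test_set)
```

Keeping the two groups disjoint at the subject level means that the feature sets and method are chosen without ever seeing speech from the subjects used for testing.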

Highlights

Speaker identification is relevant for applications such as military operations, forensic speaker recognition, and phone customer service, among others [1], [2]
We evaluated the performance of the machine learning methods using balanced accuracy
When two sessions were used to train the machine learning methods, every test set reached a balanced accuracy above 90%
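The balanced-accuracy formula itself did not survive in this copy of the page. As a hedged illustration, one common multiclass definition is the mean of the per-class recalls, so that every speaker contributes equally regardless of how many test utterances they have. The sketch below assumes that definition; the paper's exact formula may differ, and the function name and toy labels are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall (assumed definition, not taken from the paper)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Recall for each speaker that appears in the ground truth.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy example with two speakers and unequal numbers of utterances.
y_true = ["s1", "s1", "s1", "s1", "s2", "s2"]
y_pred = ["s1", "s1", "s1", "s2", "s2", "s2"]
score = balanced_accuracy(y_true, y_pred)
print(score)  # recall(s1) = 0.75, recall(s2) = 1.0, mean = 0.875
```

In this toy case plain accuracy would be 5/6, while the balanced figure is 0.875 because each speaker's recall is weighted equally.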

Summary

Introduction

Speaker identification is relevant for applications such as military operations, forensic speaker recognition, and phone customer service, among others [1], [2]. For these applications, speaker identification must be independent of the text being spoken, and there can be no reliance on emotional or situational context. This makes speaker identification challenging, because external factors such as stress, emotions, and fatigue can affect human speech [3]–[5].

Journal: IEEE Access
Publication Date: Jan 1, 2021
License type: CC BY 4.0
