Transfer Learning in Speaker’s Age and Gender Recognition

Maxim Markitantov

doi:10.1007/978-3-030-60276-5_32

Abstract

In this paper, we study the application of transfer learning to the speaker’s age and gender recognition task. Speech analysis systems that take images of log Mel-spectrograms or MFCCs as input for classification have recently been gaining popularity. We therefore used pretrained models that perform well on the ImageNet task, namely AlexNet, VGG-16, ResNet18, ResNet34, and ResNet50, as well as the state-of-the-art EfficientNet-B4 from Google. Additionally, we trained 1D CNN and TDNN models for speaker’s age and gender recognition. We compared the performance of these models on age (4 classes), gender (3 classes), and joint age and gender (7 classes) recognition. Despite the high performance of the pretrained models on the ImageNet task, our TDNN models achieved better unweighted average recall (UAR) in all tasks presented in this study: age (UAR = 51.719%), gender (UAR = 81.746%), and joint age and gender (UAR = 48.969%) recognition.
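The results above are reported as UAR, which is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The following is a minimal sketch of that metric in plain Python, not the author's evaluation code; the function name and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced 3-class toy data (not taken from the paper).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On this toy data the majority class is predicted perfectly while each minority class is recalled only half the time, so plain accuracy (0.8) looks better than UAR (about 0.667). This is why UAR is a common choice when class sizes are unbalanced.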
