Abstract

This paper estimates the accuracy of male, female, and transgender identification from voice signals using different classifiers, and also calculates the recall value for each gender. It reports identification of the third gender (transgender) for the first time. Voice signals are the most appropriate and convenient way to transfer information between subjects, and their analysis is vital for accurate and fast gender identification. Mel Frequency Cepstral Coefficients (MFCCs) are extracted from the speakers' voice signals and used as features; MFCCs are the most convenient and reliable feature for configuring the gender identification system. Recurrent Neural Network–Bidirectional Long Short-Term Memory (RNN-BiLSTM), Support Vector Machine (SVM), and Linear Discriminant Analysis (LDA) are used as classifiers. In the proposed models, the experimental results do not depend on the text of the speech, the language of the speakers, or the duration of the voice samples. The experimental results are obtained by analyzing the common voice samples. The RNN-BiLSTM classifier has a single-layer architecture, while SVM and LDA use a k-fold value of 5. The recall value of each gender and the accuracy of the proposed models vary with the number of voice samples in the training and testing datasets. The highest gender identification accuracy obtained is 94.44%. The simulation results show that the accuracy of the RNN-BiLSTM is always higher than that of SVM and LDA. The highest gender-wise recall values of the proposed model are 95.63%, 96.71%, and 97.22% for male, female, and transgender speakers, respectively. The recall value for transgender speakers is higher than for the other genders.
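
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how per-class recall and overall accuracy are conventionally computed from true and predicted labels. It is not the authors' code: the function names and the toy labels are hypothetical and chosen only for illustration, and the numbers it produces are unrelated to the paper's results.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples of a class
    divided by all samples that truly belong to that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def accuracy(y_true, y_pred):
    """Overall accuracy: fraction of samples whose prediction matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels invented for illustration (assumption, not data from the paper).
y_true = ["male", "male", "female", "female", "transgender", "transgender"]
y_pred = ["male", "female", "female", "female", "transgender", "transgender"]

recalls = per_class_recall(y_true, y_pred)
overall = accuracy(y_true, y_pred)
```

Because recall is computed separately for each class, one gender can have a higher recall than the overall accuracy, which is consistent with the per-gender figures exceeding the best overall accuracy.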
