Comparative analysis of machine learning algorithms on gender classification using Hindi speech data

Kanika Gupta, Arun Sharma, A K Mohapatra

doi:10.1201/9781003150664-40

Abstract

A speech signal carries salient paralinguistic details, including the language, gender, age, and emotion of the speaker. This information has practical applications in crime analysis, security and monitoring, and brain-computer interfaces. The primary and most important step for these applications is to identify gender from speech data. Preprocessing plays a crucial role in the development of speech identification systems because of the presence of background noise. In this study, we compare the performance of four Machine Learning (ML) algorithms for classifying gender using Hindi speech data. The data were pre-processed using a Speech Endpoint Detection (SED) algorithm and a windowing process, after which Mel-Frequency Cepstral Coefficient (MFCC) and Pitch Range features were extracted. The Random Forest algorithm outperformed the other ML classification algorithms, namely Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and Logistic Regression, and achieved an accuracy of 78.84%.
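The pre-processing stage named in the abstract (endpoint detection followed by windowing) can be illustrated with a minimal sketch. The abstract does not specify which SED algorithm or window function was used, so the short-time-energy threshold and the Hamming window below are assumptions, as are the function names and the default values (400-sample frames, 160-sample hop, 10% energy ratio).

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window of length n (assumed window; the abstract does not name one)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return [sum(x * x for x in f) / len(f) for f in frames]


def detect_endpoints(samples, frame_len=400, hop=160, ratio=0.1):
    """Energy-based endpoint detection (an assumed, simplified SED).

    Returns (start, end) sample indices spanning the first through last frame
    whose energy exceeds ratio * peak energy, or None if no speech is found.
    """
    frames = frame_signal(samples, frame_len, hop)
    if not frames:
        return None
    energy = short_time_energy(frames)
    peak = max(energy)
    if peak == 0:
        return None
    active = [i for i, e in enumerate(energy) if e > ratio * peak]
    return active[0] * hop, active[-1] * hop + frame_len


def windowed_frames(samples, frame_len=400, hop=160):
    """Apply the Hamming window to every frame, ready for feature extraction."""
    w = hamming(frame_len)
    return [[x * wk for x, wk in zip(f, w)]
            for f in frame_signal(samples, frame_len, hop)]
```

In such a pipeline, the signal would first be trimmed to the span returned by `detect_endpoints`, and the frames from `windowed_frames` would then feed MFCC and pitch extraction. A production system would more likely rely on an established audio library for these steps.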

Full Text