Temporal and spectral features based gender recognition from audio signals

E Priya, Padam Satya Reshma, Janani Priyadharshini S, Sashaank S

doi:10.1109/ic3iot53935.2022.9767929

Abstract

Gender is an important consideration in the overall development of society because of its bearing on social norms and power structures. Gender recognition is useful in forensics and sports. In forensics, many criminal cases can be resolved if the gender of the perpetrator is known, and tests are undertaken when the gender of a convict or suspect is uncertain and needs to be confirmed. This work aims to identify gender from audio signals. It presents an efficient technique that measures voice features and classifies gender from them. Spectral features (power spectral density, spectral centroid, spectral flux, and spectral roll-off) and temporal features (energy, zero-crossing rate, root mean square, and maximum amplitude) are extracted from the voice signal. The features are then scrutinized for statistical consistency using the t-test, and classification based on the Mel spectral features plot is discussed. The discussion reveals that the best-suited features for spontaneous gender classification are the power spectral density and spectral flux. These two dominant features simplify gender determination during initial diagnosis, without complex tests or a laboratory setup.
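The features named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the 85% roll-off threshold, the frame length and hop size, the periodogram scaling, and the use of Welch's form of the t statistic are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def temporal_features(x):
    """Time-domain features of a mono signal x (1-D float array)."""
    signs = np.signbit(x)
    return {
        "energy": float(np.sum(x ** 2)),
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "max_amplitude": float(np.max(np.abs(x))),
        # Fraction of adjacent sample pairs whose sign differs.
        "zcr": float(np.mean(signs[1:] != signs[:-1])),
    }

def spectral_features(x, sr, rolloff=0.85):
    """Whole-signal spectral features; rolloff=0.85 is an assumed threshold."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    power = mag ** 2
    cum = np.cumsum(power)
    # First frequency bin at which cumulative power reaches the threshold.
    idx = np.searchsorted(cum, rolloff * cum[-1])
    return {
        # Simple periodogram estimate (one-sided doubling omitted).
        "psd": power / (sr * len(x)),
        # Magnitude-weighted mean frequency.
        "centroid_hz": float(np.sum(freqs * mag) / np.sum(mag)),
        "rolloff_hz": float(freqs[idx]),
    }

def spectral_flux(x, frame_len=1024, hop=512):
    """Mean Euclidean distance between magnitude spectra of adjacent frames."""
    starts = range(0, len(x) - frame_len + 1, hop)
    frames = np.array([x[i:i + frame_len] for i in starts])
    mags = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    diffs = np.diff(mags, axis=0)
    return float(np.mean(np.sqrt(np.sum(diffs ** 2, axis=1))))

def welch_t(a, b):
    """Welch's t statistic for one feature measured in two groups (assumed variant)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    return float((a.mean() - b.mean()) / se)
```

In use, each recording would be reduced to one value per feature, and `welch_t` would then compare a feature's values across the male and female groups; a large absolute t value suggests the feature separates the groups.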

Full Text