Abstract

One of the most challenging issues in Speaker's Gender Classification (SGC) is feature extraction, because information lost while extracting features from the speech signal degrades classification accuracy. In previous research, Perceptual Linear Prediction (PLP) coefficients were extracted using the Blackman windowing method, along with other speech-signal features, to improve classification accuracy. However, some information was still lost at the window edges, which degraded recognition accuracy, and more effective features were required to improve classification performance. Hence, in this paper, SGC is improved by extracting PLP coefficients with a novel windowing technique. In this technique, type-1 features, namely the spectral and prosodic features of the speech signal, are extracted first. In addition, Information Preserving Perceptual Linear Prediction (IPPLP) coefficients are extracted using the Slepian windowing method. Moreover, the frequency-dependent transmission characteristics of the outer ear are compensated based on the analysis of time-varying Equal Loudness Contour (ELC) curves and the Peak-to-Loudness Ratio (PLR). The extracted IPPLP features are then fused with the type-1 features and classified using different combinations of classifiers, i.e., the Gaussian Mixture Model (GMM), the Support Vector Machine (SVM) and the GMM supervector-based SVM, under a score-level fusion scheme. The speaker's gender is recognized from the final classification result. The experimental results show a significant improvement in classification accuracy with the proposed technique: with the proposed speaker's gender classification method, classification accuracy values of 38.55%, 62.65% and 69.88% are obtained for GMM, SVM and GMM-SVM classification, respectively.
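The sketch below illustrates the Slepian windowing step that distinguishes IPPLP extraction from conventional Blackman-windowed PLP: speech is split into overlapping frames and each frame is tapered with a discrete prolate spheroidal (Slepian) window before spectral analysis. It is only a minimal illustration; the frame length, hop size, sampling rate and time-half-bandwidth product NW are illustrative assumptions, since the abstract does not give the paper's exact IPPLP settings, and the full ELC/PLR compensation and PLP stages are not shown.

```python
# Minimal sketch: Slepian (DPSS) windowing of speech frames, the front-end
# step preceding PLP-style spectral analysis. All parameter values here are
# assumptions for illustration, not the paper's reported configuration.
import numpy as np
from scipy.signal.windows import dpss

def frame_and_window(signal, sample_rate=16000, frame_ms=25, hop_ms=10, nw=2.5):
    """Split a speech signal into overlapping frames and apply a Slepian taper."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    window = dpss(frame_len, nw)            # first-order Slepian (DPSS) window
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    frames = np.stack([
        signal[i * hop_len : i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    return frames

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(16000)      # 1 s of placeholder audio
    frames = frame_and_window(speech)
    # Windowed power spectra: the usual input to PLP critical-band analysis.
    power_spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    print(frames.shape, power_spectra.shape)
```

Compared with a Blackman taper, the Slepian window maximizes energy concentration in the main spectral lobe for a given bandwidth, which is the property the abstract appeals to when it describes reduced information loss at the window edges.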
