Abstract

This paper proposes a novel voiceprint generation methodology for recognizing speakers registered in a system. The method addresses a keyword-dependent, closed-set speaker classification task. The features used are the Mel-spectrogram, Chromagram, MFCC, and a new combined feature called Mel-Chroma, which is formed by fusing the Mel-spectrogram and the Chromagram. The resulting Mel-Chroma spectrogram is converted into a binary image using its average value as the threshold. A long short-term memory (LSTM) recurrent neural network is used for classification, and the Free Spoken Digit Dataset (FSDD) is used for evaluation. The proposed method achieves higher accuracy than state-of-the-art methods on this task: speaker classification with the binary Mel-Chroma voiceprint reaches 98.33% accuracy.
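The abstract describes the Mel-Chroma voiceprint as a fusion of the Mel-spectrogram and Chromagram, binarized with the average value as the threshold. The following is a minimal sketch of that pipeline using librosa; the exact fusion strategy (here, vertical stacking of min-max-normalized feature maps) and the parameter values are assumptions, as the abstract does not specify them.

```python
import numpy as np
import librosa

def mel_chroma_voiceprint(path, sr=8000, n_mels=64, n_chroma=12):
    """Build a binary Mel-Chroma voiceprint from one recording.

    Sketch only: the fusion of the Mel-spectrogram and Chromagram is
    assumed to be vertical stacking of normalized feature maps, since
    the abstract does not describe the combination step in detail.
    """
    y, sr = librosa.load(path, sr=sr)  # FSDD recordings are sampled at 8 kHz

    # Mel-spectrogram (log-scaled) and Chromagram
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_chroma=n_chroma)

    # Normalize both maps to [0, 1] so neither dominates the fused feature
    def norm(m):
        return (m - m.min()) / (m.max() - m.min() + 1e-8)

    mel_chroma = np.vstack([norm(mel_db), norm(chroma)])

    # Binarize using the average value as the threshold, as stated in the abstract
    return (mel_chroma > mel_chroma.mean()).astype(np.uint8)
```

The binary voiceprint produced above could then be fed, frame by frame, to an LSTM classifier over the registered speakers, as outlined in the abstract.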
