Abstract

Forensic Automatic Speaker Recognition (FASR) is the process of judging whether a particular speech utterance was produced by a suspected speaker. The task is challenging because the test sample may be received over different channels, recorded under different noise conditions, or disguised. Robust feature extraction plays an important role in improving FASR performance. In this paper, a forensic automatic speaker recognition system is implemented that uses Power Normalized Cepstral Coefficient (PNCC) and Mel Frequency Cepstral Coefficient (MFCC) features. The system's performance is demonstrated on a Malayalam speaker database. The speaker recognition framework is a conventional i-vector based system. Experimental results suggest that PNCC features perform slightly worse than MFCC features under three test conditions: masked speech, telephone speech, and Voice over Internet Protocol (VoIP) speech. However, PNCC is observed to outperform MFCC in noisy conditions.
