Abstract

Forensic Automatic Speaker Recognition (FASR) is the process of judging whether a particular speech utterance belongs to a suspected speaker. This is a challenging task because the test sample may be received over different channels and under different noise conditions, or may be disguised. Robust feature extraction plays an important role in improving the performance of FASR. In this paper, a forensic automatic speaker recognition system is implemented that exploits Power Normalized Cepstral Coefficient (PNCC) and Mel Frequency Cepstral Coefficient (MFCC) features. The performance of the system is demonstrated on a Malayalam speaker database. The speaker recognition framework is a conventional i-vector based system. Experimental results suggest that PNCC features perform slightly worse than MFCC features when tested under masked speech, telephone speech, and Voice over Internet Protocol (VoIP) conditions. However, PNCC performs better than MFCC in noisy conditions.
