Abstract
Over the past decade, mel-frequency cepstral coefficients (MFCC) have been the most popular feature extraction method in automatic speaker recognition. In robust speaker recognition, however, MFCC performs well under white-noise contamination but less well under other noise types. We introduce speech-signal-based frequency cepstral coefficients (SFCC) to the speaker recognition domain. In this method, the frequency warping function is derived directly from the speech signal by dividing the logarithm of the ensemble-average short-time power spectrum of the entire speech corpus into equal-area portions. The resulting speech-signal-based warping function closely resembles the mel and Bark scales, which were obtained through psychoacoustic experiments. We propose combining the filter banks of MFCC and SFCC for text-independent speaker identification. Speaker identification experiments are performed on the POLY-COST database. The proposed technique outperforms single-stream MFCC-based or SFCC-based features for robust speaker identification.
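The equal-area construction of the warping function can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the frame length, the hop size, and the Hamming window are all assumptions chosen for illustration. The sketch also shifts the log spectrum so that it is non-negative before accumulating area, which is one possible way to make "equal area" well defined and is not stated in the abstract.

```python
import numpy as np

def average_log_power_spectrum(signals, frame_len=256, hop=128):
    """Log of the ensemble-average short-time power spectrum over all frames.

    frame_len, hop, and the Hamming window are illustrative assumptions.
    """
    window = np.hamming(frame_len)
    total = np.zeros(frame_len // 2 + 1)
    count = 0
    for x in signals:
        x = np.asarray(x, dtype=float)
        for start in range(0, len(x) - frame_len + 1, hop):
            frame = x[start:start + frame_len] * window
            total += np.abs(np.fft.rfft(frame)) ** 2
            count += 1
    if count == 0:
        raise ValueError("no complete frames in the input signals")
    # Small constant avoids log(0) for empty bins.
    return np.log(total / count + 1e-12)

def equal_area_band_edges(log_spectrum, sample_rate, n_bands):
    """Frequencies (Hz) splitting the area under the curve into n_bands equal parts."""
    # Assumption: shift the curve so it is non-negative before integrating.
    shifted = log_spectrum - log_spectrum.min()
    freqs = np.linspace(0.0, sample_rate / 2.0, len(shifted))
    # Cumulative area by the trapezoidal rule, starting at zero.
    steps = 0.5 * (shifted[1:] + shifted[:-1]) * np.diff(freqs)
    area = np.concatenate(([0.0], np.cumsum(steps)))
    # Invert the cumulative area at equally spaced area targets.
    targets = np.linspace(0.0, area[-1], n_bands + 1)
    return np.interp(targets, area, freqs)
```

Given a list of waveforms `corpus` sampled at 8 kHz, `equal_area_band_edges(average_log_power_spectrum(corpus), 8000, 20)` would return 21 band edges. Bands come out narrower where the averaged log spectrum is larger, which is what produces the non-uniform, mel-like spacing.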