Abstract
This paper investigates the importance of the spectro-temporal characteristics of the source excitation signal for speaker recognition. We propose an effective feature extraction technique for obtaining essential time-frequency information from the linear prediction (LP) residual signal, which is closely related to the glottal excitation of the individual speaker. With pitch-synchronous analysis, the wavelet transform is applied to every two pitch cycles of the LP residual signal to generate a new feature vector, called Wavelet Octave Coefficients of Residues (WOCOR), which provides additional speaker-discriminative power to the commonly used linear predictive cepstral coefficients (LPCC). Experimental evaluation on a Cantonese speaker recognition corpus demonstrates the effectiveness of WOCOR for speaker recognition. In recognition tests, using WOCOR together with LPCC outperforms the conventional method of using Mel-frequency cepstral coefficients (MFCC).
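The pipeline described above (LP inverse filtering, pitch-synchronous windows spanning two pitch cycles, wavelet analysis per octave) can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the per-octave L2 norm, the LP order of 12, the four octave levels, the single LP analysis over the whole signal, and the pre-computed pitch marks are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def lp_residual(frame, order=12):
    """Inverse-filter a signal with its own LP coefficients (autocorrelation method)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Normal equations R a = r[1:], with a small ridge for numerical stability
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Residual e[n] = s[n] - sum_k a[k] * s[n-k]
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(frame, inverse_filter)[:n]

def haar_octave_norms(segment, levels=4):
    """L2 norm of the Haar detail coefficients at each octave level (assumed wavelet)."""
    approx = np.asarray(segment, dtype=float)
    norms = []
    for _ in range(levels):
        if len(approx) % 2:  # zero-pad odd lengths so samples pair up
            approx = np.append(approx, 0.0)
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)
        approx = (even + odd) / np.sqrt(2.0)
        norms.append(np.linalg.norm(detail))
    return np.array(norms)

def wocor_like_features(signal, pitch_marks, order=12, levels=4):
    """One feature vector per window covering two consecutive pitch cycles."""
    residual = lp_residual(signal, order)
    feats = []
    # Pair each pitch mark with the mark two cycles later
    for start, end in zip(pitch_marks[:-2], pitch_marks[2:]):
        seg = residual[start:end]
        seg = seg / (np.linalg.norm(seg) + 1e-12)  # amplitude normalization
        feats.append(haar_octave_norms(seg, levels))
    return np.array(feats)

# Synthetic voiced-like signal: noisy impulse train through a stable AR(2) filter
rng = np.random.default_rng(0)
period = 80  # hypothetical pitch period in samples
excitation = 0.01 * rng.standard_normal(800)
excitation[::period] += 1.0
signal = np.zeros(800)
for n in range(800):
    signal[n] = excitation[n]
    if n >= 1:
        signal[n] += 1.3 * signal[n - 1]
    if n >= 2:
        signal[n] -= 0.6 * signal[n - 2]

marks = np.arange(0, 801, period)  # pitch marks assumed known in advance
features = wocor_like_features(signal, marks)
```

In practice the pitch marks would come from a pitch tracker, and the LP analysis would typically be done frame by frame rather than once over the whole signal. The resulting vectors would then be used alongside LPCC in the recognizer, as the abstract describes.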