Abstract
This paper presents a novel approach to the design of a robust speaker recognition system, in which a noise-free synthesised spectrum is produced from a noisy spectrum and then used for feature extraction. The pitch is first extracted from the noisy speech using a robust pitch estimation algorithm. This step also identifies the voiced segments of speech, which are the only segments considered in the synthesis. Once the pitch has been estimated, the noisy signal is sampled in the frequency domain at the pitch harmonics. From the sampled data, a reconstruction procedure proposed in this paper generates a noise-free synthesised spectrum that retains the characteristics of the speaker but rejects the noise contributions. We compare the results with those obtained using the original MFCC parameters and show that, on a 100-speaker database, the MFCC parameters computed on the reconstructed spectrum consistently outperform conventional MFCC parameters over a full range of noise levels under mismatched conditions, while maintaining comparable performance under matched conditions.
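As a rough illustration of the harmonic sampling step, the sketch below picks the bins of a one-sided magnitude spectrum nearest to the multiples of an estimated pitch and then rebuilds a smooth envelope between them. The abstract does not specify the reconstruction procedure, so plain linear interpolation is used here purely as a stand-in. The function names `sample_at_harmonics` and `interpolate_envelope` are hypothetical, the pitch `f0` is assumed to be already estimated for a voiced frame, and `f0` is assumed to be larger than the spacing between frequency bins.

```python
def sample_at_harmonics(spectrum, f0, sample_rate):
    """Return (bin_index, magnitude) pairs at the bins nearest to k * f0.

    `spectrum` is assumed to be a one-sided magnitude spectrum with
    n_fft // 2 + 1 entries covering 0 Hz up to sample_rate / 2.
    """
    n_fft = 2 * (len(spectrum) - 1)
    bin_hz = sample_rate / n_fft
    samples = []
    k = 1
    while k * f0 <= sample_rate / 2:
        idx = int(k * f0 / bin_hz + 0.5)  # nearest bin to the k-th harmonic
        samples.append((idx, spectrum[idx]))
        k += 1
    return samples


def interpolate_envelope(samples, n_bins):
    """Linearly interpolate between harmonic samples (an assumed stand-in
    for the paper's reconstruction); values are held flat beyond the
    first and last harmonic."""
    envelope = []
    for b in range(n_bins):
        if b <= samples[0][0]:
            envelope.append(samples[0][1])
        elif b >= samples[-1][0]:
            envelope.append(samples[-1][1])
        else:
            for (b0, m0), (b1, m1) in zip(samples, samples[1:]):
                if b0 <= b <= b1:
                    envelope.append(m0 + (m1 - m0) * (b - b0) / (b1 - b0))
                    break
    return envelope
```

In this sketch only the values at the harmonic bins are carried into the reconstructed spectrum, so whatever lies between the harmonics in the noisy spectrum is discarded. MFCC parameters would then be computed on the returned envelope rather than on the original noisy spectrum.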