Abstract

The present work investigates the robustness of Power Normalized Cepstral Coefficients (PNCC) for language identification (LID) from noisy speech. Although state-of-the-art vocal tract features such as mel frequency cepstral coefficients (MFCC) give good recognition accuracy in clean environments, their performance degrades drastically as the signal-to-noise ratio (SNR) decreases. In this work, experiments have been carried out on the IITKGP-MLILSC speech database. Gaussian mixture models (GMMs) are used to build the language models. We have used noise from the NOISEX-92 database, added synthetically to the speech at different SNR levels. We have also compared the recognition accuracy of two systems, one developed using MFCCs and the other using PNCCs. Finally, we have shown that PNCC features are more robust to noise than MFCC features.
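The abstract does not say how the noise is scaled to reach a given SNR level. As a minimal sketch, assuming the noise is scaled using the average power of the whole utterance (one common convention, not necessarily the one used in this work), the mixing step could look like the following. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and chosen only for illustration.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Assumption: SNR is defined over the average power of the whole
    utterance. Both inputs are sequences of float samples.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]  # trim noise to the utterance length

    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0.0:
        raise ValueError("noise segment has zero power")

    # Required noise power is p_speech / 10^(snr_db / 10);
    # the amplitude gain is the square root of the power ratio.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR in dB of `noisy` relative to the clean `speech`."""
    p_speech = sum(s * s for s in speech)
    p_resid = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10.0 * math.log10(p_speech / p_resid)
```

In an evaluation of this kind, each test utterance would be passed through such a function once per SNR level before feature extraction, so that the MFCC and PNCC systems see identical noisy inputs.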
