Abstract

In this work, the robustness of excitation source features is analyzed for the language identification (LID) task. The raw samples of the linear prediction (LP) residual signal and its magnitude and phase components are processed at the sub-segmental, segmental and supra-segmental levels to capture robust language-specific phonotactic information. The study is carried out on 27 Indian languages from the Indian Institute of Technology Kharagpur-Multi Lingual Indian Language Speech Corpus (IITKGP-MLILSC). Gaussian mixture models are used to develop the LID systems from this language-specific excitation source information. The robustness of the excitation source information is evaluated with respect to (i) background noise, (ii) varying amounts of training data and (iii) varying lengths of test samples. Finally, the robustness of the proposed excitation source features is compared with that of well-known spectral features, using the LID performance obtained on the IITKGP-MLILSC database. At low SNR levels, the segmental-level excitation source features obtained from the raw samples of the LP residual signal and from its phase component perform better than the vocal tract features.
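
As an illustration of the quantities named above, the following is a minimal sketch of how an LP residual and its magnitude and phase components could be computed for a single speech frame. It is not the authors' implementation. It assumes the autocorrelation method for LP analysis, and it assumes that the magnitude component is the Hilbert envelope of the residual and that the phase component is the cosine of the analytic-signal phase; the abstract does not state these choices. The function names, the LP order and the small regularization term are hypothetical.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Estimate LP coefficients a[0..order-1] by the autocorrelation method,
    so that s[n] is predicted by sum_k a[k-1] * s[n-k] for k = 1..order."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Tiny diagonal loading for numerical safety (an assumption of this sketch).
    R = R + 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, a):
    """Residual e[n] = s[n] - sum_k a[k-1] * s[n-k], with zero initial history."""
    frame = np.asarray(frame, dtype=float)
    order = len(a)
    padded = np.concatenate([np.zeros(order), frame])
    predicted = np.zeros(len(frame))
    for k in range(1, order + 1):
        # padded[order - k + i] equals s[i - k] (zero for i < k).
        predicted += a[k - 1] * padded[order - k: order - k + len(frame)]
    return frame - predicted


def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive frequencies,
    zero negative frequencies."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def magnitude_and_phase(residual):
    """Split the residual into a Hilbert envelope (magnitude component) and
    the cosine of the analytic-signal phase (phase component)."""
    z = analytic_signal(residual)
    envelope = np.abs(z)
    cos_phase = np.cos(np.angle(z))
    return envelope, cos_phase
```

Under these assumptions, the product of the envelope and the cosine phase reproduces the residual, so the two components together carry the same information as the raw residual samples. Feature vectors built from any of the three could then be modelled per language with Gaussian mixture models, as the abstract describes.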
