Abstract
India is a multilingual society with more than 1600 languages, most of which share an overlapping set of phonemes. This overlap makes it difficult to develop a language identification (LID) framework for Indian languages. In this paper, this challenge is addressed using phonetic features. To model the temporal variations in phonetic features, an attention-based residual time-delay neural network (RES-TDNN) is proposed. This network captures long-range temporal dependencies through the TDNN and an attention mechanism. The proposed network is evaluated on the IIITH-ILSC database, which consists of 22 official Indian languages and Indian English, using phonetic and acoustic features. The attention-based RES-TDNN outperformed other state-of-the-art networks, such as the deep neural network and the long short-term memory network, and produced an equal error rate of 9.46%. Further, the fusion of shifted delta cepstral and phonetic features improved the performance.