Multilingual Speech Corpus in Low-Resource Eastern and Northeastern Indian Languages for Speaker and Language Identification

Joyanta Basu, Soma Khan, Rajib Roy, Swanirbhar Majumder, Tapan Kumar Basu

doi:10.1007/s00034-021-01704-x

Abstract

Research and development of speech technology applications for low-resource languages (LRLs) is challenging because suitable speech corpora are not available. For most Indian languages in particular, the data found in digital sources are sparse in both amount and type, and prior work is too limited to meet large-scale development needs. This paper describes the creation of an LRL corpus comprising sixteen rarely studied Eastern and Northeastern (E&NE) Indian languages and presents the variability of the data through different statistics. Several experiments are then carried out on the collected LRL corpus to build baseline speaker identification (SID) and language identification (LID) systems for acceptance evaluation. To investigate the presence of speaker- and language-specific information, spectral features such as Mel frequency cepstral coefficients (MFCCs), shifted delta cepstral (SDC) features, and relative spectral transform-perceptual linear prediction (RASTA-PLP) features are used. Vector quantization (VQ), Gaussian mixture model (GMM), support vector machine (SVM), and multilayer perceptron (MLP)-based models are developed to represent the speaker- and language-specific information captured by these features. In addition, SID and LID models based on i-vectors, time delay neural networks (TDNNs), and recurrent neural networks with long short-term memory (LSTM-RNN) are evaluated to reflect recent approaches. The performance of the developed systems is analyzed on the LRL corpus in terms of SID and LID accuracy. The best SID and LID accuracies observed for the baseline systems are 94.49% and 95.69%, respectively, using LSTM-RNN with the MFCC + SDC feature.
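
As a rough illustration of the MFCC + SDC feature mentioned in the abstract, the sketch below computes shifted delta cepstra from a matrix of cepstral frames and appends them to the static coefficients. It is not the authors' implementation. The function name `shifted_delta_cepstra`, the parameter values d=1, P=3, k=7 with 7 static coefficients (a configuration often used in LID work), the edge-padding at utterance boundaries, and the random stand-in for real MFCC frames are all assumptions made for this example.

```python
import numpy as np


def shifted_delta_cepstra(cep, d=1, p=3, k=7):
    """Shifted delta cepstra for a (num_frames, n_coeffs) cepstral matrix.

    For each frame t and each block i in 0..k-1 the delta is
        c(t + i*p + d) - c(t + i*p - d),
    and the k deltas are stacked, giving n_coeffs * k values per frame.
    The defaults are illustrative, not taken from the paper.
    """
    cep = np.asarray(cep, dtype=float)
    num_frames = cep.shape[0]
    # Repeat the edge frames so every index t + i*p +/- d is valid
    # (a boundary-handling choice assumed for this sketch).
    padded = np.pad(cep, ((d, (k - 1) * p + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        # Original frame t sits at index t + d in the padded matrix.
        ahead = padded[i * p + 2 * d : i * p + 2 * d + num_frames]
        behind = padded[i * p : i * p + num_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)


# Stand-in for real MFCC frames: 100 frames of 7 coefficients.
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 7))

sdc = shifted_delta_cepstra(mfcc)
# Static coefficients followed by their shifted deltas, per frame.
mfcc_sdc = np.hstack([mfcc, sdc])
```

With these assumed settings each frame carries 7 static plus 49 shifted-delta values, so the per-frame vector spans a longer temporal context than the static coefficients alone.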

Journal: Circuits, Systems, and Signal Processing
Publication Date: Apr 20, 2021
Citations: 12

Similar Papers

Speaker Identification in Spoken Language Mismatch Condition: An Experimental Study
Joyanta Basu ... Swanirbhar Majumder
01 Jan 2020

Performance Evaluation of Speaker Identification in Language and Emotion Mismatch Conditions on Eastern and North Eastern Low Resource Languages of India
Joyanta Basu ... Tapan Kumar Basu
14 Nov 2021

A hierarchical language identification system for Indian languages
S. Jothilakshmi ... V. Ramalingam
Digital Signal Processing | VOL. 22
27 Jan 2012

Speaker-based language identification for Ethio-Semitic languages using CRNN and hybrid features
Malefia Demilie Melese ... Ibrahim Gashaw Kasa
Network: Computation in Neural Systems | VOL. ahead-of-print
06 Jun 2024
