Abstract
Bahasa Indonesia is one of the most prominent low-resource languages and still lacks development in communication-assisting technology. This paper proposes an improved system for generating transcripts and identifying speakers from concurrent speech in Bahasa Indonesia. The proposed method is applicable in situations such as online meetings and remote conferences. The system combines a Reinforcement Learning (RL) model with pitch-aware speech separation to identify the speakers in concurrent speech. A Recurrent Neural Network (RNN) is used to generate the text transcript, which is then improved by an external language model and a spelling correction model. The proposed system was able to identify up to 5 speakers with a variable degree of confidence and to generate a transcript for each of them with better quality than other methods when evaluated with several metrics. The results show that the proposed method performs better than the baseline method, even in the single-speaker situation, and functions in the simultaneous-speech situation, with an average Word Error Rate (WER) of 16.59% for two speakers, 26.72% for three speakers, and 31.50% for four speakers.
Highlights
The Indonesian language, often called Bahasa Indonesia, is a unifying language of the Austronesian family, formed from hundreds of local languages throughout the country
Speech Processing in Indonesian Language. In this section we explore several studies in the field of speech processing for the Indonesian language and the approaches that have been proposed to tackle the challenges of developing speech technology for a low-resource language such as Indonesian
To evaluate the proposed method, we mainly used word error rate (WER) and word accuracy, which are widely used as assessment metrics for word and text recognition
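As an illustration of the evaluation metrics named above, the following is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function names and the example sentences are hypothetical, and word accuracy is assumed here to be 1 - WER, which is a common convention; the paper's exact tokenization and definition may differ.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def word_accuracy(reference, hypothesis):
    """Assumed convention: word accuracy = 1 - WER (can be negative)."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Hypothetical example: one deleted word ("ke") and one substitution
# ("ini" -> "itu") against a six-word reference.
reference = "saya pergi ke pasar hari ini"
hypothesis = "saya pergi pasar hari itu"
print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
print(f"Word accuracy: {word_accuracy(reference, hypothesis):.2%}")
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.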
Summary
The Indonesian language, often called Bahasa Indonesia, is a unifying language of the Austronesian family, formed from hundreds of local languages throughout the country. Although it draws on a wide variety of ethnic accents, words often share a similar pattern and meaning across many places. Bahasa Indonesia makes use of word repetition. It is considered an agglutinative language, meaning that it has a wide range of prefixes and suffixes. According to [4], Bahasa Indonesia has 33 phonemes, consisting of seven vowel phonemes, three diphthongs, and 23 consonant phonemes. These are the standard phonemes used by Indonesians when uttering Indonesian words, without considering their allophones. Modern Indonesian is written in the Roman script, which consists of 26 letters from 'a' to 'z'.