Abstract

A speech corpus is the basic requirement for developing an automatic speech recognition (ASR) system, so it must be built with great accuracy to enhance the performance of the system. This paper describes a proposed procedure to follow when collecting a Swahili speech corpus from native and non-native speakers for the development of an ASR system for the Swahili language.

General Terms
Speech Database, Speech Recognition, Natural Language Processing, Human Computer Interaction

Keywords
Swahili, Swahili Text Corpus, Phonetics, Text Corpus and Speech Corpus, Automatic Speech Recognition

1. INTRODUCTION

Speech is the most prominent and natural form of communication between humans. Because human communication is dominated by spoken language, researchers are trying to build speech interfaces for computers [1]. Speech has the potential to be used as a mode of interaction with computers. Human beings have long been motivated to create systems that can understand and talk like humans, and to that end researchers have tried to develop systems for the analysis and classification of speech signals. Since the 1960s, computer scientists have been researching ways to make computers record, interpret and understand human speech. Speech processing has become increasingly important in daily life as the number of web-enabled mobile phone users grows in rural as well as urban areas.

Most research efforts in natural language processing (NLP) for African languages, Swahili included, have been rooted in the rule-based paradigm. The rule-based approach has merits as well as demerits. Its merit is design transparency; its demerits are that it is highly language dependent and costly to develop, as it typically involves a lot of manual effort by experts from the NLP field.
Such systems are decidedly competence-based and are often tweaked and tuned towards a small set of ideal sample words or sentences, neglecting real-world language technology applications. In language technology for many African languages, researchers are growing tired of publications on real-world data or reports. With the increasing use of digital resources across Africa, there is a great need for more empirical approaches to language technology, such as data-driven and corpus-based approaches. The main advantages of these approaches are language independence, development speed, robustness and empiricism.

Sources are scarce in the sense that few digital text resources exist. Recent efforts to address this for Swahili have followed carefully selected procedures [2, 3]. Language technology applications such as speech recognition, text-to-speech synthesis, machine-aided translation and web-related tasks create a great need for translation and usability of the Swahili language. Much work also remains to be done on the semantics and syntax of Swahili, because the largest online text resources, available through Google, Yahoo and Wikipedia, are not fully correct. The major need is the extraction of information that enhances and refocuses work on Swahili as a language; the available corpus needs to be syntactically and semantically correct.

This paper focuses on the procedure to be followed for the development of an isolated numeric speech corpus. Section 2 describes the Swahili language. Section 3 describes the selected Swahili text corpus. Section 4 describes the procedure to be followed for developing the speech corpus. Section 5 describes the recording procedure to be followed. The conclusion and future work are discussed in Section 6.
