Automatic Speech Recognition (ASR) is the use of computer hardware and software-based techniques to identify and process human voices. Most commercial ASR systems for adult speech work efficiently, and they were built on speech data collected from both male and female speakers. In recent decades, ASR systems for children have emerged in applications such as reading tutors, aids for foreign language learning, and computer games. Child ASR systems are essential but remain poorly understood in the field of computer speech recognition. Collecting child speech data is a very complex task. Child corpora are not publicly available, child speakers vary widely, and an ASR system developed for one age group is not suitable for other age groups. These are some of the reasons why child ASR systems are few and ineffective, and the lack of publicly available child corpora is the primary one. Designing and developing a child corpus is a very tedious task. Therefore, the primary focus of this state-of-the-art review is to discuss the various challenges encountered while designing and developing a child corpus.