Abstract

We present initial work toward a children's speech recognition system for use within an interactive reading and comprehension training system. We first describe the Colorado Literacy Tutor project and two corpora collected for children's speech recognition research. Next, we report baseline speech recognition experiments that illustrate the degree of acoustic mismatch for children in grades K through 5, and show that applying vocal tract length normalization (VTLN) to children's speech yields an 11.2% relative reduction in word error rate. Finally, we describe our baseline system for automatic recognition of spontaneously spoken story summaries. Using unsupervised maximum a posteriori linear regression (MAPLR) adaptation together with VTLN to compensate for inter-speaker acoustic variability, the system achieves a word error rate of 42.6% on the presented children's story summarization task. Based on this result, we point to promising directions for further study.
