Abstract

Keyword spotting (KWS) is the automatic detection of keywords in a continuous speech signal. A variety of strategies have been suggested in the literature for detecting keywords in adults’ speech effectively. However, only a limited number of studies have been reported on KWS in children’s speech. Owing to physiological differences, the pitch and speaking rate of children differ from those of adults. Consequently, a KWS system whose model parameters are trained on adults’ speech performs poorly on children’s speech. In this paper, we develop a KWS system for spotting keywords in children’s speech using models trained on adults’ speech. The proposed approach uses the spectral moment time–frequency distribution augmented by low-order cepstral coefficients (SMAC) as the front-end feature. The mismatches due to differences in pitch and speaking rate between child and adult speakers are further mitigated by data-augmented training using explicit pitch and speaking-rate modifications. The experimental findings presented in this paper show that the SMAC feature yields significantly better performance than conventional Mel-frequency cepstral coefficients under both clean and noisy test conditions.
