Abstract

Automatic recognition of children's speech under acoustically mismatched conditions is challenging because of the large differences between adults' and children's speech. In the literature, this challenge is often addressed by concatenating feature- and model-domain adaptation methods such as vocal tract length normalization (VTLN), maximum likelihood linear regression (MLLR) and heteroscedastic linear discriminant analysis (HLDA). Even so, a significant gap remains between recognition performance for adults and for children. This work explores eigenvoices (EV) based adaptation to reduce that gap when children's speech is recognized with acoustic models trained on adults' speech. EV is a fast adaptation approach and enables effective gender biasing of the acoustic models. Combining EV with VTLN, MLLR and HLDA under mismatched conditions yields an absolute improvement of about 50% over the unadapted speaker-independent system, significantly reducing the gap between recognition performance for adults and children.
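
For readers unfamiliar with eigenvoices, the standard formulation can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $\boldsymbol{\mu}(s)$ is the concatenated mean supervector of the acoustic model adapted to speaker $s$, $\bar{\boldsymbol{\mu}}$ is the mean supervector over the training speakers, $\mathbf{e}_k$ are the eigenvoices (typically the leading principal components of the training speakers' supervectors), and $w_k(s)$ are speaker-specific weights.

```latex
% Eigenvoice model: the adapted supervector is constrained to a
% K-dimensional subspace spanned by the eigenvoices e_1, ..., e_K.
\boldsymbol{\mu}(s) = \bar{\boldsymbol{\mu}} + \sum_{k=1}^{K} w_k(s)\,\mathbf{e}_k

% The weights are commonly estimated by maximizing the likelihood of the
% speaker's adaptation data X_s under the adapted model.
\hat{\mathbf{w}}(s) = \arg\max_{\mathbf{w}} \; p\bigl(X_s \mid \boldsymbol{\mu}(s; \mathbf{w})\bigr)
```

Because only $K$ weights are estimated, with $K$ usually far smaller than the supervector dimension, very little adaptation data is needed, which is why the approach is described as fast.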
