Abstract

This paper introduces two novel techniques for rapid speaker adaptation: reference speaker weighting and consistency modeling. Also presented is an adaptation technique called speaker cluster weighting (SCW), which provides a means of improving upon generic hierarchical speaker clustering techniques. Each of these adaptation methods attempts to exploit the underlying within-speaker correlations between the acoustic realizations of different phones. By accounting for these correlations, a limited amount of adaptation data can be used to adapt every phonetic acoustic model, including the models for phones that have not been observed in the adaptation data. Results were obtained on the DARPA Resource Management corpus for a set of rapid adaptation experiments in which single test utterances were used simultaneously for adaptation and recognition. With the new adaptation techniques, relative word error rate reductions ranging from 4.9% to 8.4% were obtained under various conditions. With a combination of hierarchical speaker clustering and the novel adaptation techniques, a word error rate reduction of 20% was achieved compared with the baseline speaker-independent (SI) recognition system.
