Abstract
This work aims to improve recognition performance for spontaneous speech. To that end, the authors of this chapter propose new unsupervised adaptation approaches for spontaneous speech and evaluate them using diagonal-covariance and full-covariance hidden Markov models. The adaptation procedure applies language model (LM) adaptation and acoustic model (AM) adaptation iteratively, and several ways of combining the two are tested to find the best-performing approach. For LM adaptation, a word trigram model and a part-of-speech (POS) trigram model are combined to build a more task-specific LM. In addition, the authors propose an unsupervised speaker adaptation technique based on adaptation data weighting, in which the weight depends on POS class. The recognition experiments use the Corpus of Spontaneous Japanese (CSJ), a large-scale spontaneous speech database that has been used in Japan as the common evaluation database for spontaneous speech. In these experiments, the proposed methods demonstrated a significant advantage on this task.